(12) PATENT
(19) AUSTRALIAN PATENT OFFICE
(11) Application No. AU 200042518 B2
(10) Patent No. 758454

(54) Title
     Solid phase sequencing of biopolymers

(51) International Patent Classification(s)
     C12Q 001/68    H01J 049/26
     C07H 021/00

(21) Application No: 200042518
(22) Application Date: 2000.06.19

(43) Publication Date: 2000.08.24
(43) Publication Journal Date: 2000.08.24
(44) Accepted Journal Date: 2003.03.20

(62) Divisional of:
     199655446

(71) Applicant(s)
     Trustees of Boston University; Sequenom, Inc.

(72) Inventor(s)
     Charles R. Cantor; Hubert Koster; Cassandra C. Smith; Dong-Jing Fu

(74) Agent/Attorney
     SPRUSON and FERGUSON, GPO Box 3898, SYDNEY NSW 2001



Solid Phase Sequencing of Biopolymers 



Abstract 

The invention relates to methods for detecting and sequencing target nucleic acid
sequences, and double-stranded nucleic acid sequences, to nucleic acid probes, to mass
modified nucleic acid probes, to arrays of probes useful in these methods and to kits and
systems which contain these probes. Useful methods involve hybridising the nucleic acids
or nucleic acids which represent complementary or homologous sequences of the target to
an array of nucleic acid probes. These probes comprise a single-stranded portion, an
optional double-stranded portion and a variable sequence within the single-stranded portion.
The molecular weights of the hybridised nucleic acids of the set can be determined by mass
spectroscopy, and the sequence of the target determined from the molecular weights of the
fragments. Nucleic acids whose sequences can be determined include DNA or RNA in
biological samples such as patient biopsies and environmental samples. Probes may be
fixed to a solid support such as a hybridisation chip to facilitate automated molecular weight
analysis and identification of the target sequence.



S&FRef: 482783DI 



AUSTRALIA
PATENTS ACT 1990
COMPLETE SPECIFICATION

FOR A STANDARD PATENT

ORIGINAL

Name and Address
of Applicants:        Trustees of Boston University
                      147 Bay State Road
                      Boston Massachusetts 02215
                      United States of America

                      Sequenom, Inc.
                      11555 Sorrento Valley Road
                      San Diego California 92121
                      United States of America

Actual Inventor(s):   Charles R. Cantor, Hubert Koster, Cassandra Smith,
                      Dong-Jing Fu

Address for Service:  Spruson & Ferguson
                      St Martins Tower
                      31 Market Street
                      Sydney NSW 2000

Invention Title:      Solid Phase Sequencing of Biopolymers

The following statement is a full description of this invention, including the best method of
performing it known to me/us:-



SOLID PHASE SEQUENCING OF BIOPOLYMERS 



Background of the Invention

1. Field of the Invention

This invention relates to methods for detecting and sequencing nucleic
acids using sequencing by hybridization technology and molecular weight analysis. The
invention also relates to probes and arrays useful in sequencing and detection and to kits
and apparatus for determining sequence information.

2. Description of the Background

Since the recognition of nucleic acid as the carrier of the genetic code, a
great deal of interest has centered around determining the sequence of that code in the
many forms in which it is found. Two landmark studies made the process of nucleic acid
sequencing, at least with DNA, a common and relatively rapid procedure practiced in
most laboratories. The first describes a process whereby terminally labeled DNA
molecules are chemically cleaved at single base repetitions (A.M. Maxam and W. Gilbert,
Proc. Natl. Acad. Sci. USA 74:560-64, 1977). Each base position in the nucleic acid
sequence is then determined from the molecular weights of fragments produced by partial
cleavages. Individual reactions were devised to cleave preferentially at guanine, at
adenine, at cytosine and thymine, and at cytosine alone. When the products of these four
reactions are resolved by molecular weight, using, for example, polyacrylamide gel
electrophoresis, DNA sequences can be read from the pattern of fragments on the resolved
gel.

The second study describes a procedure whereby DNA is
sequenced using a variation of the plus-minus method (F. Sanger et al., Proc.
Natl. Acad. Sci. USA 74:5463-67, 1977). This procedure takes advantage
of the chain terminating ability of dideoxynucleoside triphosphates
(ddNTPs) and the ability of DNA polymerase to incorporate ddNTPs with
nearly equal fidelity as the natural substrate of DNA polymerase,
deoxynucleoside triphosphates (dNTPs). Briefly, a primer, usually an
oligonucleotide, and a template DNA are incubated together in the presence
of a useful concentration of all four dNTPs plus a limited amount of a single
ddNTP. The DNA polymerase occasionally incorporates a
dideoxynucleotide which terminates chain extension. Because the
dideoxynucleotide has no 3'-hydroxyl, the initiation point for the polymerase
enzyme is lost. Polymerization produces a mixture of fragments of varied
sizes, all having identical 3' termini. Fractionation of the mixture by, for
example, polyacrylamide gel electrophoresis, produces a pattern which
indicates the presence and position of each base in the nucleic acid.
Reactions with each of the four ddNTPs allow one of ordinary skill to read
an entire nucleic acid sequence from a resolved gel.
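
The chain-termination logic described above can be illustrated with a small sketch. It is an idealized model, not the procedure of the cited study: it assumes every position terminates at least once in the matching ddNTP reaction, and the function names sanger_fragments and read_gel are hypothetical.

```python
def sanger_fragments(synthesized_strand):
    """Return {base: [fragment lengths]} for the four ddNTP reactions.

    `synthesized_strand` is the strand (5'->3') the polymerase would write
    if it never terminated. Idealized assumption: every position terminates
    at least once in the reaction containing the matching ddNTP.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(synthesized_strand, start=1):
        # A ddNTP of type `base` can stop the chain here, giving a
        # fragment whose length equals this position.
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the sequence by walking fragment lengths from short to long."""
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))


lanes = sanger_fragments("GATTACA")
print(lanes["A"])       # fragment lengths seen in the ddATP reaction
print(read_gel(lanes))  # sequence recovered from the four lanes
```

Each lane plays the role of one ddNTP reaction; sorting by length stands in for the size fractionation performed on the gel.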

Despite their advantages, these procedures are cumbersome
and impractical when one wishes to obtain megabases of sequence
information. Further, these procedures are, for all practical purposes,
limited to sequencing DNA. Although variations have developed, it is still
not possible using either process to obtain sequence information directly
from any other form of nucleic acid.



A relatively new method for obtaining sequence information
from a nucleic acid has recently been developed whereby the sequences of
groups of contiguous bases are determined simultaneously. In comparison
to traditional techniques whereby one determines base specific information
of a sequence individually, this method, referred to as sequencing by
hybridization (SBH), represents a many-fold amplification in speed. Due,
at least in part, to the increased speed, SBH presents numerous advantages
including reduced expense and greater accuracy. Two general approaches
of sequencing by hybridization have been suggested and their practicality
has been demonstrated in pilot studies. In one format, a complete set of 4^n
oligonucleotides of length n is immobilized as an ordered array on a solid support
and an unknown DNA sequence is hybridized to this array (K.R. Khrapko
et al., J. DNA Sequencing and Mapping 1:375-88, 1991). The resulting
hybridization pattern provides all "n-tuple" words in the sequence. This is
sufficient to determine short sequences except for simple tandem repeats.
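
The way a set of n-tuple words determines a short sequence, and why a simple tandem repeat defeats it, can be sketched as follows. This is a minimal illustration under stated assumptions, not an algorithm from the cited work: the array is assumed to report every word perfectly, the first word of the target is assumed to be known, and the helper names spectrum and reconstruct are hypothetical.

```python
def spectrum(sequence, n):
    """Set of all n-tuple words in `sequence` (what an ideal array reports)."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def reconstruct(words, start):
    """Greedy reconstruction from a word set, starting at a known first word.

    At each step the sequence is extended by the single remaining word whose
    first n-1 bases overlap the current end. Returns None when a step is
    ambiguous or no word fits.
    """
    n = len(start)
    remaining = set(words) - {start}
    sequence = start
    while remaining:
        suffix = sequence[-(n - 1):]
        candidates = [w for w in remaining if w.startswith(suffix)]
        if len(candidates) != 1:
            return None
        sequence += candidates[0][-1]
        remaining.remove(candidates[0])
    return sequence


target = "ATGCGTAC"
print(reconstruct(spectrum(target, 3), "ATG") == target)  # short sequence recovered

repeat = "ATATATAT"
print(spectrum(repeat, 3))                      # only two distinct words survive
print(reconstruct(spectrum(repeat, 3), "ATA"))  # repeat length is lost
```

Because the hybridization pattern records only which words are present, not how often, the tandem repeat collapses to two words and its length cannot be recovered.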

In the second format, an array of immobilized samples is
hybridized with one short oligonucleotide at a time (Z. Strezoska et al., Proc.
Natl. Acad. Sci. USA 88:10,089-93, 1991). When repeated 4^n times for each
oligonucleotide of length n, much of the sequence of all the immobilized
samples would be determined. In both approaches, the intrinsic power of
the method is that many sequenced regions are determined in parallel. In
actual practice the array size is about 10^4 to 10^5.

Another aspect of the method is that information obtained is
quite redundant, especially as the size of the nucleic acid probe grows.
Mathematical simulations have shown that the method is quite resistant to
experimental errors and that far fewer than all probes are necessary to
determine reliable sequence data (P.A. Pevzner et al., J. Biomol. Struc. &
Dyn. 9:399-410, 1991; W. Bains, Genomics 11:295-301, 1991).

In spite of an overall optimistic outlook, there are still a
number of potentially severe drawbacks to actual implementation of
sequencing by hybridization. First and foremost among these is that 4^n
rapidly becomes quite a large number if chemical synthesis of all of the
oligonucleotide probes is actually contemplated. Various schemes of
automating this synthesis and compressing the products into a small scale
array, a sequencing chip, have been proposed.
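
The growth of 4^n with probe length can be made concrete with a few lines of arithmetic. The function name probe_count and the probe lengths chosen are illustrative only.

```python
def probe_count(n):
    """Number of distinct oligonucleotide probes of length n over A, C, G, T."""
    return 4 ** n


for n in (6, 8, 10, 12):
    print(n, probe_count(n))
# 6 4096
# 8 65536
# 10 1048576
# 12 16777216
```

Each additional nucleotide multiplies the number of probes to be synthesized by four.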
There is also a poor level of discrimination between
correctly hybridized, perfectly matched duplexes and end mismatches. In
part, these drawbacks have been addressed at least to a small degree by the
method of continuous stacking hybridization as reported by Khrapko et al.
(FEBS Lett. 256:118-22, 1989). Continuous stacking hybridization is based
upon the observation that when a single-stranded oligonucleotide is
hybridized adjacent to a double-stranded oligonucleotide, the two duplexes
are mutually stabilized as if they are positioned side-to-side due to a
stacking contact between them. The stability of the interaction decreases
significantly as stacking is disrupted by nucleotide displacement, gap or
terminal mismatch. Internal mismatches are presumably ignorable because
their thermodynamic stability is so much less than perfect matches.
Although promising, a related problem arises which is the inability to
distinguish between weak, but correct duplex formation, and simple
background such as non-specific adsorption of probes to the underlying
support matrix.



Detection is also monochromatic wherein separate sequential
positive and negative controls must be run to discriminate between a correct
hybridisation match, a mis-match, and background. All too often,
ambiguities develop in reading sequences longer than a few hundred base
pairs on account of sequence recurrences. For example, if a sequence one
base shorter than the probe recurs three times in the target, the sequence
position cannot be uniquely determined. The locations of these sequence
ambiguities are called branch points.
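
The recurrence condition in the example above can be checked directly on a known sequence. The sketch below simply counts every subsequence one base shorter than the probe and reports those that recur at least three times; the function name branch_points and the min_repeats parameter are hypothetical, and the test sequence is invented for illustration.

```python
from collections import Counter


def branch_points(target, probe_length, min_repeats=3):
    """Return {subsequence: count} for (probe_length - 1)-mers that recur.

    Following the example in the text, a subsequence one base shorter than
    the probe is flagged when it occurs at least `min_repeats` times.
    """
    k = probe_length - 1
    counts = Counter(target[i:i + k] for i in range(len(target) - k + 1))
    return {word: c for word, c in counts.items() if c >= min_repeats}


print(branch_points("ACGTTACGAAACGC", probe_length=4))  # {'ACG': 3}
print(branch_points("ATGCGTAC", probe_length=4))        # {}
```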

Secondary structures often develop in the target nucleic acid
affecting accessibility of the sequences. This could lead to blocks of
sequences that are unreadable if the secondary structure is more stable than
occurs on the complementary strand.

A final drawback is the possibility that certain probes will
have anomalous behavior and, for one reason or another, be recalcitrant to
hybridization under whatever standard sets of conditions are ultimately used.
A simple example of this is the difficulty in finding matching conditions for
probes rich in G/C content. A more complex example could be sequences
with a high propensity to form triple helices. The only way to rigorously
explore these possibilities is to carry out extensive hybridization studies with
all possible oligonucleotides of length "n" under the particular format and
conditions chosen. This is clearly impractical if many sets of conditions are
involved.

Among the early publications which appeared discussing
sequencing by hybridization, E.M. Southern (WO 89/10977) described
methods whereby unknown, or target, nucleic acids are labeled, hybridized
to a set of nucleotides of chosen length on a solid support, and the nucleotide
sequence of the target determined, at least partially, from knowledge of the
sequence of the bound fragments and the pattern of hybridization observed.
Although promising, as a practical matter, this method has numerous
drawbacks. Probes are entirely single-stranded and binding stability is
dependent upon the size of the duplex. However, every additional
nucleotide of the probe necessarily increases the size of the array by four
fold creating a dichotomy which severely restricts its plausible use. Further,
there is an inability to deal with branch point ambiguities or secondary
structure of the target, and hybridization conditions will have to be tailored
or in some way accounted for each binding event. Attempts have been made
to overcome or circumvent these problems.

R. Drmanac et al. (U.S. Patent No. 5,202,231) is directed to
methods for sequencing by hybridization using sets of oligonucleotide
probes with random or variable sequences. These probes, although useful,
suffer from some of the same drawbacks as the methodology of Southern
(1989), and like Southern, fail to recognize the advantages of stacking
interactions.

Khrapko et al. (FEBS Lett. 256:118-22, 1989; and J.
DNA Sequencing and Mapping 1:375-88, 1991) attempt to address some of
these problems using a technique referred to as continuous stacking
hybridization. With continuous stacking, conceptually, the entire sequence
of a target nucleic acid can be determined. Basically, the target is
hybridized to an array of probes, again single-stranded, denatured from the
array, and the dissociation kinetics of denaturation analyzed to determine the
target sequence. Although also promising, discrimination between matches
and mis-matches (and simple background) is low and, further, as
hybridization conditions are inconstant for each duplex, discrimination
becomes increasingly reduced with increasing target complexity.

Another major problem with current sequencing formats is the
inability to efficiently detect sequence information. In conventional
procedures, individual sequences are separated by, for example,
electrophoresis using capillary or slab gels. This step is slow, expensive and
requires the talents of a number of highly trained individuals, and, more
importantly, is prone to error. One attempt to overcome these difficulties
has been to utilize the technology of mass spectrometry.

Mass spectrometry of organic molecules was made possible
by the development of instruments able to volatize large varieties of organic
compounds and by the discovery that the molecular ion formed by
volatization breaks down into charged fragments whose structures can be
related to the intact molecule. Although the process itself is relatively
straightforward, actual implementation is quite complex. Briefly, the
sample molecule or analyte is volatized and the resulting vapor passed into
an ion chamber where it is bombarded with electrons accelerated to a
compatible energy level. Electron bombardment ionizes the molecules of
the sample analyte and then directs the ions formed to a mass analyzer. The
mass analyzer, with its combination of electrical and magnetic fields,
separates impacting ions according to their mass/charge (m/e) ratios. From
these ratios, the molecular weights of the impacting ions can be determined
and the structure and molecular weight of the analyte determined. The
entire process requires less than about 20 microseconds.

Attempts to apply mass spectrometry to the analysis of
biomolecules such as proteins and nucleic acids have been disappointing.
Mass spectrometric analysis has traditionally been limited to molecules with
molecular weights of a few thousand daltons. At higher molecular weights,
samples become increasingly difficult to volatize and large polar molecules
generally cannot be vaporized without catastrophic consequences. The
energy requirement is so significant that the molecule is destroyed or, even
worse, fragmented. Mass spectra of fragmented molecules are often
difficult or impossible to read. Fragment linking order, particularly useful
for reconstructing a molecular structure, has been lost in the fragmentation
process. Both signal to noise ratio and resolution are significantly
negatively affected. In addition, and specifically with regard to
biomolecular sequencing, extreme sensitivity is necessary to detect the
single base differences between biomolecular polymers to determine
sequence identity.

A number of new methods have been developed based on the
idea that heat, if applied with sufficient rapidity, will vaporize the sample
biomolecule before decomposition has an opportunity to take place. This
rapid heating technique is referred to as plasma desorption and there are
many variations. For example, one method of plasma desorption involves
placing a radioactive isotope such as Californium-252 on the surface of a
sample analyte which forms a blob of plasma. From this plasma, a few ions
of the sample molecule will emerge intact. Field desorption ionization,
another form of desorption, utilizes strong electrostatic fields to literally
extract ions from a substrate. In secondary ionization mass spectrometry or
fast ion bombardment, an analyte surface is bombarded with electrons which
encourage the release of intact ions. Fast atom bombardment involves
bombarding a surface with accelerated ions which are neutralized by a
charge exchange before they hit the surface. Presumably, neutralization of
the charge lessens the probability of molecular destruction, but not the
creation of ionic forms of the sample. In laser desorption, photons comprise
the vehicle for depositing energy on the surface to volatize and ionize
molecules of the sample. Each of these techniques has had some measure
of success with different types of sample molecules. Recently, there have
also been a variety of techniques and combinations of techniques
specifically directed to the analysis of nucleic acids.

Brennan et al. used nuclide markers to identify terminal
nucleotides in a DNA sequence by mass spectrometry (U.S. Patent No.
5,003,059). Stable nuclides, detectable by mass spectrometry, were placed
in each of the four dideoxynucleotides used as reagents to polymerize cDNA
copies of the target DNA sequence. Polymerized copies were separated
electrophoretically by size and the terminal nucleotide identified by the
presence of the unique label.

Fenn et al. describes a process for the production of a mass
spectrum containing a multiplicity of peaks (U.S. Patent No. 5,130,538).
Peak components comprised multiply charged ions formed by dispersing a
solution containing an analyte into a bath gas of highly charged droplets.
An electrostatic field charged the surface of the solution and dispersed the
liquid into a spray referred to as an electrospray (ES) of charged droplets.
This nebulization provided a high charge/mass ratio for the droplets
increasing the upper limit of volatization. Detection was still limited to less
than about 100,000 daltons.

Jacobson et al. utilizes mass spectrometry to analyze a DNA
sequence by incorporating stable isotopes into the sequence (U.S. Patent No.
5,002,868). Incorporation required the steps of enzymatically introducing
the isotope into a strand of DNA at a terminus, electrophoretically
separating the strands to determine fragment size and analyzing the
separated strand by mass spectrometry. Although accuracy was stated to
have been increased, electrophoresis was necessary to isolate the labeled
strand.

Brennan also utilized stable markers to label the terminal
nucleotides in a nucleic acid sequence, but added the step of completely
degrading the components of the sample prior to analysis (U.S. Patent Nos.
5,003,059 and 5,174,962). Nuclide markers, enzymatically incorporated
into either dideoxynucleotides or nucleic acid primers, were
electrophoretically separated. Bands were collected and subjected to
combustion and passed through a mass spectrometer. Combustion converts
the DNA into oxides of carbon, hydrogen, nitrogen and phosphorous, and
the label into sulfur dioxide. Labeled combustion products were identified
and the mass of the initial molecule reconstructed. Although fairly accurate,
the process does not lend itself to large scale sequencing of biopolymers.

A recent advancement in the mass spectrometric analysis of
high molecular weight molecules in biology has been the development of
time of flight mass spectrometry (TOF-MS) with matrix-assisted laser
desorption ionization (MALDI). This process involves placing the sample
into a matrix which contains molecules which assist in the desorption
process by absorbing energy at the frequency used to desorb the sample.
The theory is that volatization of the matrix molecules encourages
volatization of the sample without significant destruction. Time of flight
analysis utilizes the travel time or flight time of the various ionic species as
an accurate indicator of molecular mass. There have been some notable
successes with these techniques.
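
How flight time indicates molecular mass can be sketched with the textbook relation for an idealized linear instrument, which is an assumption here and is not taken from the specification: an ion of mass m and charge z, accelerated from rest through a potential U and then drifting over a field-free length L, arrives after t = L * sqrt(m / (2 z e U)). The function names, the 20 kV potential and the 1 m tube length below are illustrative choices.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def flight_time(mass_da, charge, accel_voltage, tube_length):
    """Drift time in seconds for an idealized linear time-of-flight tube.

    Assumes the ion starts at rest, gains kinetic energy z*e*U in a
    negligibly short acceleration region, then drifts `tube_length` metres.
    """
    mass_kg = mass_da * DALTON
    energy = charge * ELEMENTARY_CHARGE * accel_voltage
    return tube_length * math.sqrt(mass_kg / (2.0 * energy))


def mass_from_time(time_s, charge, accel_voltage, tube_length):
    """Invert flight_time: recover the mass in daltons from the drift time."""
    energy = charge * ELEMENTARY_CHARGE * accel_voltage
    mass_kg = 2.0 * energy * (time_s / tube_length) ** 2
    return mass_kg / DALTON


t_light = flight_time(1000.0, 1, 20000.0, 1.0)
t_heavy = flight_time(4000.0, 1, 20000.0, 1.0)
print(t_heavy / t_light)                         # four times the mass, twice the time
print(mass_from_time(t_heavy, 1, 20000.0, 1.0))  # recovers about 4000 Da
```

Because flight time scales with the square root of mass, small mass differences between large ions translate into small timing differences, which is why resolution matters for the methods discussed here.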

Beavis et al. proposed to measure the molecular weights of
DNA fragments in mixtures prepared by either Maxam-Gilbert or Sanger
sequencing techniques (U.S. Patent No. 5,288,644). Each of the different
DNA fragments to be generated would have a common origin and terminate
at a particular base along an unknown sequence. The separate mixtures
would be analyzed by laser desorption time of flight mass spectroscopy to
determine fragment molecular weights. Spectra obtained from each reaction
would be compared using computer algorithms to determine the location of
each of the four bases and ultimately, the sequence of the fragment.

Williams et al. utilized a combination of pulsed laser ablation,
multiphoton ionization and time of flight mass spectrometry. Effective laser
desorption was accomplished by ablating a frozen film of a solution
containing sample molecules. When ablated, the film produces an
expanding vapor plume which entrains the intact molecules for analysis by
mass spectrometry.

Even more recent developments in mass spectrometry have
further increased the upper limits of molecular weight detection and
determination. Mass spectrograph systems with reflectors in the flight tube
have effectively doubled resolution. Reflectors also compensate for errors
in mass caused by the fact that the ionized/accelerated region of the
instrument is not a point source, but an area of finite size wherein ions can
accelerate at any point. Spatial differences between the origination
points of the particles, problematic in conventional instruments because
arrival times at the detector will vary, are overcome. Particles that spend
more time in the accelerating field will also spend more time in the retarding field.
Therefore, particles emerging from the reflector are mostly synchronous, vastly
improving resolution.

Despite these advances, it is still not possible to generate coordinated spectra 
representing a continuous sequence. Furthermore, throughput is sufficiently slow so as to 
make these methods impractical for large scale analysis of sequence information. 

Summary of the Invention

The present invention overcomes the problems and disadvantages associated with
current strategies and designs and provides methods, kits and apparatus for determining
the sequence of target nucleic acids.

Described herein are methods for sequencing a target nucleic acid. A set of nucleic
acid fragments containing a sequence which is complementary or homologous to a
sequence of the target is hybridized to an array of nucleic acid probes wherein each probe
comprises a double-stranded portion, a single-stranded portion and a variable sequence
within said single-stranded portion, forming a target array of nucleic acids. Molecular
weights for a plurality of nucleic acids of the target array are determined and the sequence
of the target constructed. Nucleic acids of the target, the target sequence, the set and the
probes may be DNA, RNA or PNA comprising purine, pyrimidine or modified bases. The
probes may be fixed to a solid support such as a hybridization chip to facilitate automated
determination of molecular weights and identification of the target sequence.

There are also described herein further methods for sequencing a target nucleic
acid. A set of nucleic acid fragments containing a sequence which is complementary or
homologous to a sequence of the target is hybridized to an array of nucleic acid probes
forming a target array containing a plurality of nucleic acid complexes. One
strand of those probes hybridized by a fragment is extended using the
fragment as a template. Molecular weights for a plurality of nucleic acids of the target
array are determined and the sequence of the target constructed.
Strands can be enzymatically extended using chain terminating and chain
elongating nucleotides. The resulting nested set of nucleic acids represents the sequence
of the target.
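
One way to see how a nested set of extended strands can represent a sequence is to read the mass differences between successive members of the set. The sketch below is only an illustration under stated assumptions, not the method as claimed: it uses approximate average masses of unmodified DNA nucleotide residues, assumes one strand per extension length, assumes the terminator contributes the same constant offset to every strand so that it cancels in the differences, and ignores mass modification. The names RESIDUE_MASS, ladder_masses and read_ladder, and the 5000 Da probe mass, are hypothetical.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA chain.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def ladder_masses(probe_mass, extension):
    """Masses of a nested set: the probe extended by 1, 2, ... bases."""
    masses, total = [], probe_mass
    for base in extension:
        total += RESIDUE_MASS[base]
        masses.append(round(total, 2))
    return masses


def read_ladder(probe_mass, masses, tolerance=1.0):
    """Infer the extension sequence from successive mass differences."""
    sequence, previous = [], probe_mass
    for mass in sorted(masses):
        delta = mass - previous
        # Pick the residue whose mass is closest to the observed step.
        base = min(RESIDUE_MASS, key=lambda b: abs(RESIDUE_MASS[b] - delta))
        if abs(RESIDUE_MASS[base] - delta) > tolerance:
            raise ValueError(f"no residue matches a mass step of {delta:.2f} Da")
        sequence.append(base)
        previous = mass
    return "".join(sequence)


masses = ladder_masses(5000.0, "GATTACA")
print(read_ladder(5000.0, masses))  # GATTACA
```

The smallest gap between two residue masses in this table is about 9 Da (A versus T), which gives a sense of the mass accuracy such a readout would need.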

There are also described herein methods for detecting a target nucleic acid. A set of
nucleic acids complementary to a sequence of the target is hybridized to a fixed array of
nucleic acid probes. The molecular weights of the hybridized nucleic acids are
determined by mass spectrometry and a sequence of the target can be identified. Target
nucleic acids may be obtained from biological samples such as patient samples
wherein detection of the target is indicative of a disorder in the patient, such as a genetic
defect, a neoplasm or an infection.

There are also described herein additional methods for sequencing a target nucleic
acid. A sequence of the target is cleaved into nucleic acid fragments and the fragments
hybridized to an array of nucleic acid probes. Fragments are created by enzymatically or
physically cleaving the target and the sequence of the fragments is homologous with or
complementary to at least a portion of the target sequence. The array is attached to a solid
support and the molecular weights of the hybridized fragments determined by mass
spectrometry. From the molecular weights determined, nucleotide sequences of the
hybridized fragments are determined and a nucleotide sequence of the target can be
identified.

Further methods for sequencing a target nucleic acid are also described herein. A
set of nucleic acids complementary to a sequence of the target is hybridized to an array of
single-stranded nucleic acid probes wherein each probe comprises a constant sequence
and a variable sequence and said variable sequence is determinable. The molecular
weights of the hybridized nucleic acids are determined and the sequence of said target
identified. The array comprises less than or equal to about 4^R different probes, wherein R
is the length in nucleotides of the variable sequence, and may be attached to a solid support.

There are also described herein methods for sequencing a target nucleic acid by
strand-displacement, double-stranded sequencing. A set of partially single-stranded and
partially double-stranded nucleic acid fragments are provided wherein each fragment
contains a sequence that corresponds to a sequence of the target. These nucleic acid
fragments are hybridized to a set of partially single-stranded and partially double-stranded
nucleic acid probes, via the single-stranded regions of each, to form a set of
fragment/probe complexes. Prior to hybridization, either the fragments or the probes may
be treated with a phosphorylase to remove phosphate groups from the 5'-termini of the
nucleic acids. 5'-termini are ligated with adjacent 3'-termini of the complex forming a
common single strand. The complementary unligated strand contains a nick which is
recognized by a nucleic acid polymerase that initiates strand-displacement
polymerization, extending the unligated strand. Polymerization proceeds, using the
ligated strand as a template, in the presence of labeled nucleotides such as mass modified
nucleotides. The sequence of the target can be determined by mass spectrometry from the
molecular weights of the extended strands. This process can be used to sequence target
nucleic acids and also to identify a single sequence in a mixed background. Selection of
the species of nucleic acid to be sequenced occurs upon hybridization to the probe. As
only fragments complementary to the single-stranded region of the probe will form
complexes, only those fragment complexes are sequenced.

There are also described herein arrays of nucleic acid probes. In these arrays, each
probe comprises a first strand and a second strand wherein the first strand is hybridized to
the second strand forming a double-stranded portion, a single-stranded portion and a
variable sequence within the single-stranded portion. The array may be attached to a solid
support such as a material that facilitates volatization of nucleic acids for mass
spectrometry. Arrays can be fixed to hybridization chips containing less than or equal to
about 4^R different probes wherein R is the length in nucleotides of the variable sequence.
Arrays can be used in detection methods and in kits to detect nucleic acid sequences
which may be indicative of a disorder and in sequencing systems such as sequencing by
mass spectrometry.

Also described herein are arrays of single-stranded nucleic acid probes wherein
each probe of the array comprises a constant sequence and a variable sequence which is
determinable. Arrays may be attached to solid supports which comprise matrices that
facilitate volatization of nucleic acids for mass spectrometry. Arrays, generated by
conventional processes, may be characterized using the above methods and replicated in
mass for use in nucleic acid detection and sequencing systems.

There are also described herein kits for detecting a sequence of a target nucleic acid.
Kits contain arrays of nucleic acid probes fixed to a solid support wherein each probe
comprises a double-stranded portion, a single-stranded portion and a variable sequence
within said single-stranded portion. The solid support may be, for example, coated with a
matrix that facilitates volatization of nucleic acids for mass spectrometry such as an
aqueous composition.

The present application also describes mass spectrometry systems for the rapid
sequencing of nucleic acids. Systems comprise a mass spectrometer, a computer with
appropriate software and probe arrays which can be used to capture and sort nucleic acid
sequences for subsequent analysis by mass spectrometry.

Accordingly, in a first embodiment of the invention there is provided a method for
sequencing a target nucleic acid, comprising the steps of:
(a) providing

(i) a set of nucleic acid fragments, wherein each fragment contains a
sequence that corresponds to a sequence of the target nucleic acid, and

(ii) an array of nucleic acid probes, wherein each probe comprises a
single-stranded portion comprising a variable region;

(b) hybridizing the set of nucleic acid fragments to the array of nucleic acid
probes to form a target array of nucleic acids; and

(c) determining molecular weights of nucleic acids in the target array to
identify hybrids and thereby determine the sequence of the target nucleic acid.

According to a second embodiment of the invention there is provided a method for
sequencing a target nucleic acid, comprising the steps of:

(a) providing

(i) a set of nucleic acid fragments, wherein each fragment contains a
sequence that corresponds to a sequence of the target nucleic acid, and

(ii) an array of nucleic acid probes, wherein each probe comprises a
single-stranded portion comprising a variable region;

(b) hybridizing the set of nucleic acid fragments to the array of nucleic acid
probes to form a target array of nucleic acids;

(c) enzymatically extending the nucleic acid probes of the target array using
the hybridized target nucleic acid as a template to form extended strands; and

(d) determining molecular weights of the extended strands, whereby the
sequence of the target nucleic acid is determined.

According to a third embodiment of the invention there is provided a method of 
detecting a target nucleic acid, comprising the steps of: 
(a) providing 

(i) a set of nucleic acid fragments, wherein each fragment contains a 
sequence that corresponds to a sequence of the target nucleic acid, and 

(ii) an array of nucleic acid probes, wherein each probe comprises a 
single-stranded portion comprising a variable region; 

(b) hybridizing the set of nucleic acid fragments to the array of nucleic acid 
probes to form a target array of nucleic acids; and 

(c) determining molecular weights for nucleic acids of the target array, 
whereby the target nucleic acid is detected. 

According to a fourth embodiment of the invention there is provided a method for 
sequencing a target nucleic acid, comprising the steps of: 
(a) providing 

(i) a set of partially single-stranded nucleic acid fragments, wherein 
each fragment contains a sequence that corresponds to a sequence of the target nucleic 
acid, and 



(ii) an array of nucleic acid probes, wherein each probe comprises a 
single-stranded portion comprising a variable region and a double-stranded portion; 

(b) hybridizing the single-stranded portions of the fragments to single- 
stranded portions of the array of nucleic acid probes; 

(c) ligating single strands of the fragments to adjacent single strands of the 
probes; 

(d) extending the unligated strands using the ligated strand as a template; 
and 

(e) determining the molecular weights of the extended strands, whereby the 
sequence of the target nucleic acid is determined. 

According to a fifth embodiment of the invention there is provided a method for 
identifying a target nucleic acid sequence in a mixture containing a plurality of different 
nucleic acid sequences, comprising the steps of: 

(a) treating the nucleic acids to create partially single-stranded, partially 
double-stranded nucleic acid fragments; 

(b) hybridizing the single-stranded portions of the fragments to single- 
stranded portions of probes comprising a single-stranded portion comprising a variable 
region, and a partially double-stranded portion; 

(c) ligating single strands of the fragments to adjacent single strands of the 
probes; 

(d) extending the unligated strands using the ligated strand as a template; 

(e) determining the molecular weights of the extended strands; and 

(f) identifying a target nucleic acid sequence by the molecular weight of 
the extended strands. 

Other embodiments and advantages of the invention are set forth, in part, in the 
description which follows and, in part, will be obvious from this description and may be 
learned from the practice of the invention. 

Description of the Drawings 
Figure 1 (A) Schematic of a mass modified nucleic acid primer; and 
(B) primer mass modification moieties. 

Figure 2 (A) Schematic of mass modified nucleoside triphosphate elongators and 
terminators; and 
(B) nucleoside triphosphate mass modification moieties. 

Figure 3 List of mass modification moieties. 



Figure 4 List of mass modification moieties. 

Figure 5 Cleavage site of Mwo I indicating bidirectional sequencing. 

Figure 6 Schematic of sequencing strategy after target DNA digestion 
by Tsp RI. 

Figure 7 Calculated Tm of matched and mismatched complementary 
DNA. 

Figure 8 Replication of a master array. 

Figure 9 Reaction scheme for the covalent attachment of DNA to a 
surface. 

Figure 10 Target nucleic acid capture and ligation. 

Figure 11 Ligation efficiency of matches as compared to mismatches. 

Figure 12 (A) Ligation of target DNA with probe attached at 5'-terminus; 
and (B) ligation of target DNA with probe attached 
at the 3'-terminus. 

Figure 13 Gel reader sequencing results from primer hybridization 
analysis. 

Figure 14 Mass spectrometry of oligonucleotide ladder. 

Figure 15 Schematic of mass modification by alkylation. 

Figure 16 Mass spectrum of 17-mer target with 0, 1 or 2 mass modified 
moieties. 

Figure 17 Schematic of nicked strand displacement sequencing with 
immobilized template. 

Figure 18 Analysis of sequencing reaction in the presence and absence 
of single-stranded DNA binding protein. 

Figure 19 Schematic of nicked strand displacement sequencing with 
immobilized probe. 



Figure 20 Results of sequencing performed using DF27-1 as a probe. 

Figure 21 Results of sequencing performed using DF27-2 as a probe. 

Figure 22 Results of sequencing performed using DF27-4 as a probe. 

Figure 23 Results of sequencing performed using DF27-5-CY5 as a 
probe. 

Figure 24 Results of sequencing performed using DF27-6-CY5 as a 
probe. 

Description of the Invention 

As embodied and broadly described herein, the present 
invention is directed to methods for sequencing a nucleic acid, probe arrays 
useful for sequencing by mass spectrometry and kits and systems which 
comprise these arrays. 

Nucleic acid sequencing, on both a large and small scale, is 
critical to many aspects of medicine and biology such as, for example, in the 
identification, analysis or diagnosis of diseases and disorders, and in 
determining relationships between living organisms. Conventional 
sequencing techniques rely on a base-by-base identification of the sequence 
using electrophoresis in a semi-solid such as an agarose or polyacrylamide 
gel to determine sequence identity. Although attempts have been made to 
apply mass spectrometric analysis to these methods, the two processes are 
not well suited because, at least in part, information is still being gathered in a 
single base format. Sequencing-by-hybridization methodology has 
enhanced the sequencing process and provided a more optimistic outlook for 
more rapid sequencing techniques; however, this methodology is no more 
applicable to mass spectrometry than traditional sequencing techniques. 

In contrast, positional sequencing by hybridization (PSBH) 
with its ability to stably bind and discriminate different sequences with large 
or small arrays of probes is well suited to mass spectrometric analysis. 
Sequence information is rapidly determined in batches and with a minimum 
of effort. Such processes can be used for both sequencing unknown nucleic 
acids and for detecting known sequences whose presence may be an 
indicator of a disease or contamination. Additionally, these processes can 
be utilized to create coordinated patterns of probe arrays with known 
sequences. Determination of the sequence of fragments hybridized to the 
probes also reveals the sequence of the probe. These processes are currently 
not possible with conventional techniques and, further, a coordinated batch- 
type analysis provides a significant increase in sequencing speed and 
accuracy which is expected to be required for effective large scale 
sequencing operations. 

PSBH is also well suited to nucleic acid analysis wherein 
sequence information is not obtained directly from hybridization. Sequence 
information can be learned by coupling PSBH with techniques such as mass 
spectrometry. Target nucleic acid sequences can be hybridized to probes or 
arrays of probes as a method of sorting nucleic acids having distinct 
sequences without having a priori knowledge of the sequences of the 
various hybridization events. As each probe will be represented as multiple 
copies, it is only necessary that hybridization has occurred to isolate distinct 
sequence packages. In addition, as distinct packages of sequences, they can 
be amplified, modified or otherwise controlled for subsequent analysis. 
Amplification increases the number of specific sequences which assists in 
any analysis requiring increased quantities of nucleic acid while retaining 
sequence specificity. Modification may involve chemically altering the 
nucleic acid molecule to assist with later or downstream analysis. 

Consequently, another important feature of the invention is the 
ability to simply and rapidly mass modify the sequences of interest. A mass 
modification is an alteration in the mass, typically measured in terms of 
molecular weight as daltons, of a molecule. Mass modifications which 
increase the discrimination between at least two nucleic acids with single 
base differences in size or sequence can be used to facilitate sequencing 
using, for example, molecular weight determinations. 
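
By way of illustration only, the benefit of mass modification for discriminating 
single base differences can be sketched numerically. The residue masses below 
are approximate average values for unmodified DNA nucleotide residues, and the 
50 dalton modification placed on thymidine is purely hypothetical; neither is 
taken from this specification. 

```python
# Approximate average masses (daltons) of DNA nucleotide residues within a
# chain. These values are illustrative assumptions, not from the specification.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def smallest_gap(masses):
    """Return (gap, (lighter_base, heavier_base)) for the closest pair of masses."""
    bases = sorted(masses, key=masses.get)
    gaps = [(masses[b2] - masses[b1], (b1, b2)) for b1, b2 in zip(bases, bases[1:])]
    return min(gaps)


# Unmodified residues: the closest pair limits single base discrimination.
gap, pair = smallest_gap(RESIDUE_MASS)

# Hypothetical mass modification: add 50 daltons to every T residue.
modified = dict(RESIDUE_MASS, T=RESIDUE_MASS["T"] + 50.0)
mod_gap, mod_pair = smallest_gap(modified)

print(f"unmodified: {pair} differ by {gap:.2f} Da")
print(f"modified:   {mod_pair} differ by {mod_gap:.2f} Da")
```

Under these assumed values the closest pair of residues moves from about 9 
daltons apart to about 16 daltons apart, which is the kind of increased 
discrimination the paragraph above describes. 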

One embodiment of the invention is directed to a method for 
sequencing a target nucleic acid using mass modified nucleic acids and mass 
spectrometry technology. Target nucleic acids which can be sequenced 
include sequences of deoxyribonucleic acid (DNA) or ribonucleic acid 
(RNA). Such sequences may be obtained from biological, recombinant or 
other man-made sources, or purified from a natural source such as a patient's 
tissue or obtained from environmental sources. Alternate types of molecules 
which can be sequenced include polyamide nucleic acid (PNA) (P.E. 
Nielsen et al., Science 254:1497-1500, 1991) or any sequence of bases joined 
by a chemical backbone that have the ability to base pair or hybridize with 
a complementary chemical structure. 

The bases of DNA, RNA and PNA include purines, 
pyrimidines and purine and pyrimidine derivatives and modifications, which 
are linearly linked to a chemical backbone. Common chemical backbone 
structures are deoxyribose phosphate, ribose phosphate, and polyamide. The 
purines of both DNA and RNA are adenine (A) and guanine (G). Others 
that are known to exist include xanthine, hypoxanthine, 2- and 1-diaminopurine, 
and other more modified bases. The pyrimidines are 
cytosine (C), which is common to both DNA and RNA, uracil (U) found 
predominantly in RNA, and thymidine (T) which occurs almost exclusively 
in DNA. Some of the more atypical pyrimidines include methylcytosine, 
hydroxymethyl-cytosine, methyluracil, hydroxymethyluracil, 
dihydroxypentyluracil, and other base modifications. These bases interact 
in a complementary fashion to form base-pairs, such as, for example, 
guanine with cytosine and adenine with thymidine. This invention also 
encompasses situations in which there is non-traditional base pairing such 
as Hoogsteen base pairing which has been identified in certain tRNA 
molecules and postulated to exist in a triple helix. 

Sequencing involves providing a nucleic acid sequence which 
is homologous or complementary to a sequence of the target. Sequences 
may be chemically synthesized using, for example, phosphoramidite 
chemistry or created enzymatically by incubating the target in an appropriate 
buffer with chain elongating nucleotides and a nucleic acid polymerase. 
Initiation and termination sites can be controlled with dideoxynucleotides 
or oligonucleotide primers, or by placing coded signals directly into the 
nucleic acids. The sequence created may comprise any portion of the target 
sequence or the entire sequence. Alternatively, sequencing may involve 
elongating DNA in the presence of boron derivatives of nucleotide 
triphosphates. Resulting double-stranded samples are treated with a 3' 
exonuclease such as exonuclease III. This exonuclease stops when it 
encounters a boronated residue thereby creating a sequencing ladder. 

Nucleic acids can also be purified, if necessary, to remove 
substances which could be harmful (e.g. toxins), dangerous (e.g. infectious) 
or might interfere with the hybridization reaction or the sensitivity of that 
reaction (e.g. metals, salts, protein, lipids). Purification may involve 
techniques such as chemical extraction with salts, chloroform or phenol, 
sedimentation centrifugation, chromatography or other techniques known 
to those of ordinary skill in the art. 

If sufficient quantities of target nucleic acid are available and 
the nucleic acids are sufficiently pure or can be purified so that any 
substances which would interfere with hybridization are removed, a plurality 
of target nucleic acids may be directly hybridized to the array. Sequence 
information can be obtained without creating complementary or homologous 
copies of a target sequence. 

Sequences may also be amplified, if necessary or desired, to 
increase the number of copies of the target sequence using, for example, 
polymerase chain reaction (PCR) technology or any of the amplification 
procedures. Amplification involves denaturation of template DNA by 
heating in the presence of a large molar excess of each of two or more 
oligonucleotide primers and four dNTPs (dGTP, dCTP, dATP, dTTP). The 
reaction mixture is cooled to a temperature that allows the oligonucleotide 
primer to anneal to target sequences, after which the annealed primers are 
extended with DNA polymerase. The cycle of denaturation, annealing, and 
DNA synthesis, the principle of PCR amplification, is repeated many times 
to generate large quantities of product which can be easily identified. 

The major product of this exponential reaction is a segment of 
double stranded DNA whose termini are defined by the 5' termini of the 
oligonucleotide primers and whose length is defined by the distance between 
the primers. Under normal reaction conditions, the amount of polymerase 
becomes limiting after 25 to 30 cycles or about one million fold 
amplification. Further amplification is achieved by diluting the sample 
1000 fold and using it as the template for further rounds of amplification in 
another PCR. By this method, amplification levels of 10^9 to 10^10 can be 
achieved during the course of 60 sequential cycles. This allows for the 
detection of a single copy of the target sequence in the presence of 
contaminating DNA, for example, by hybridization with a radioactive probe. 
With the use of sequential PCR, the practical detection limit of PCR can be 
as low as 10 copies of DNA per sample. 
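
By way of illustration only, the relationship between cycle number and fold 
amplification can be sketched with an idealized model in which each cycle 
multiplies the product by (1 + efficiency). The model and the efficiency 
values are assumptions made for this sketch and are not stated in the 
specification. 

```python
import math


def fold_amplification(cycles, efficiency=1.0):
    """Idealized fold increase after `cycles` rounds.

    efficiency=1.0 means perfect doubling each cycle (an assumption).
    """
    return (1.0 + efficiency) ** cycles


def cycles_needed(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least `fold` amplification."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


# With perfect doubling, a million fold amplification needs 20 cycles.
print(cycles_needed(1e6))
# With assumed per-cycle efficiencies of 0.75 and 0.6, it needs 25 and 30 cycles.
print(cycles_needed(1e6, 0.75), cycles_needed(1e6, 0.6))
```

Under this simple model, per-cycle efficiencies somewhat below perfect doubling 
are consistent with the 25 to 30 cycles described above for about one million 
fold amplification. 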

Although PCR is a reliable method for amplification of target 
sequences, a number of other techniques can be used such as ligase chain 
reaction, self sustained sequence replication, Qβ replicase amplification, 
polymerase chain reaction linked ligase chain reaction, gapped ligase chain 
reaction, ligase chain detection and strand displacement amplification. The 
principle of ligase chain reaction is based in part on the ligation of two 
adjacent synthetic oligonucleotide primers which uniquely hybridize to one 
strand of the target DNA or RNA. If the target is present, the two 
oligonucleotides can be covalently linked by ligase. A second pair of 
primers, almost entirely complementary to the first pair of primers, is also 
provided. The template and the four primers are placed into a thermocycler 
with a thermostable ligase. As the temperature is raised and lowered, 
oligonucleotides are renatured immediately adjacent to each other on the 
template and ligated. The ligated product of one reaction serves as the 
template for a subsequent round of ligation. The presence of target is 
manifested as a DNA fragment with a length equal to the sum of the two 
adjacent oligonucleotides. 



Target sequences are fragmented, if necessary, into a plurality 
of fragments using physical, chemical or enzymatic means to create a set of 
fragments of uniform or relatively uniform length. Preferably, the 
sequences are enzymatically cleaved using nucleases such as DNases or 
RNases (mung bean nuclease, micrococcal nuclease, DNase I, RNase A, 
RNase T1), type I or II restriction endonucleases, or other site-specific or 
non-specific endonucleases. Sizes of nucleic acid fragments are between 
about 5 to about 1,000 nucleotides in length, preferably between about 10 
to about 200 nucleotides in length, and more preferably between about 12 
to about 100 nucleotides in length. Sizes in the range of about 5, 10, 12, 15, 
18, 20, 24, 26, 30 and 35 are useful to perform small scale analysis of short 
regions of a nucleic acid target. Fragment sizes in the range of 25, 50, 75, 
125, 150, 175, 200 and 250 nucleotides and larger are useful for rapidly 
analyzing larger target sequences. 

Target sequences may also be enzymatically synthesized 
using, for example, a nucleic acid polymerase and a collection of chain 
elongating nucleotides (NTPs, dNTPs) and limiting amounts of chain 
terminating (ddNTPs) nucleotides. This type of polymerization reaction can 
be controlled by varying the concentration of chain terminating nucleotides 
to create sets, for example nested sets, which span various size ranges. In 
a nested set, fragments will have one common terminus and one terminus 
which will be different between the members of the set such that the larger 
fragments will contain the sequences of the smaller fragments. 
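
By way of illustration only, a nested set as just described can be sketched 
as the set of prefixes of a template strand. The function name, the example 
sequence and the choice of the 5' end as the common terminus are assumptions 
made for this sketch. 

```python
def nested_set(template, min_len=1):
    """Return fragments sharing a common 5' terminus, ordered by length.

    Each fragment ends at a different position, so every larger fragment
    contains the sequence of every smaller fragment.
    """
    return [template[:n] for n in range(min_len, len(template) + 1)]


# Hypothetical five nucleotide template, keeping fragments of length 2 or more.
ladder = nested_set("GATTC", min_len=2)
print(ladder)

# Every fragment is contained in (is a prefix of) the next larger fragment.
contained = all(big.startswith(small) for small, big in zip(ladder, ladder[1:]))
print(contained)
```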

The set of fragments created, which may be either homologous 
or complementary to the target sequence, is hybridized to an array of nucleic 
acid probes forming a target array of nucleic acid probe/fragment 
complexes. An array constitutes an ordered or structured plurality of nucleic 
acids which may be fixed to a solid support or in liquid suspension. 
Hybridization of the fragments to the array allows for sorting of very large 
collections of nucleic acid fragments into identifiable groups. Sorting does 
not require a priori knowledge of the sequences of the probes, and can 
greatly facilitate analysis by, for example, mass spectrophotometric 
techniques. 
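
By way of illustration only, the sorting of fragments into identifiable groups 
can be sketched as binning each fragment under the probe variable region that 
is complementary to its terminal nucleotides. Perfect-match hybridization at 
the 3' end of the fragment, the function names and the example sequences are 
all simplifying assumptions made for this sketch. 

```python
from collections import defaultdict

# Watson-Crick complements for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sort_by_probe(fragments, k):
    """Group fragments by the k-nucleotide probe region that would capture them.

    Assumes each fragment is captured through its last k nucleotides and that
    only perfectly complementary probes bind (an idealization).
    """
    bins = defaultdict(list)
    for frag in fragments:
        if len(frag) >= k:
            bins[reverse_complement(frag[-k:])].append(frag)
    return dict(bins)


# Hypothetical fragments sorted on a four nucleotide variable region.
groups = sort_by_probe(["GATTACA", "CCTTACA", "GGGACGT"], k=4)
print(groups)
```

In this sketch the two fragments that end in the same four nucleotides fall 
into the same group, while the third is captured by a different probe. 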

Hybridization between complementary bases of DNA, RNA, 
PNA, or combinations of DNA, RNA and PNA, occurs under a wide variety 
of conditions such as variations in temperature, salt concentration, 
electrostatic strength, and buffer composition. Examples of these conditions 
and methods for applying them are described in Nucleic Acid Hybridization: 
A Practical Approach (B.D. Hames and S.J. Higgins, editors, IRL Press, 
1985). It is preferred that hybridization takes place between about 0°C and 
about 70°C, for periods of from about one minute to about one hour, 
depending on the nature of the sequence to be hybridized and its length. 
However, it is recognized that hybridizations can occur in seconds or hours, 
depending on the conditions of the reaction. For example, typical 
hybridization conditions for a mixture of two 20-mers are to bring the mixture 
to 68°C and let cool to room temperature (22°C) for five minutes or at very 
low temperatures such as 2°C in 2 microliters. Hybridization between 
nucleic acids may be facilitated using buffers such as Tris-EDTA (TE), Tris- 
HCl and HEPES, salt solutions (e.g. NaCl, KCl, CaCl2), other aqueous 
solutions, reagents and chemicals. Examples of these reagents include 
single-stranded binding proteins such as Rec A protein, T4 gene 32 protein, 
E. coli single-stranded binding protein and major or minor nucleic acid 
groove binding proteins. Examples of other reagents and chemicals include 
divalent ions, polyvalent ions and intercalating substances such as ethidium 
bromide, actinomycin D, psoralen and angelicin. 

Optionally, hybridized target sequences may be ligated to a 
single strand of the probes thereby creating ligated target-probe complexes 
or ligated target arrays. Ligation of target nucleic acid to probe increases 
fidelity of hybridization and allows for incorrectly hybridized target to be 
easily washed from correctly hybridized target. More importantly, the 
addition of a ligation step allows for hybridizations to be performed under 
a single set of hybridization conditions. Variation of hybridization 
conditions due to base composition is no longer relevant as nucleic acids 
with high A/T or G/C content ligate with equal efficiency. Consequently, 
discrimination is very high between matches and mis-matches, much higher 
than has been achieved using other methodologies wherein the effects of 
G/C content were only somewhat neutralized in high concentrations of 
quaternary or tertiary amines such as, for example, 3M tetramethyl 
ammonium chloride. Further, hybridization conditions such as temperatures 
of between about 22°C to about 37°C, salt concentrations of between about 
0.05 M to about 0.5 M, and hybridization times of between about less than 
one hour to about 14 hours (overnight), are also suitable for ligation. 
Ligation reactions can be accomplished using a eukaryotic derived or a 
prokaryotic derived ligase such as T4 DNA or RNA ligase. Methods for use 
of these and other nucleic acid modifying enzymes are described in Current 
Protocols in Molecular Biology (F.M. Ausubel et al., editors, John Wiley & 
Sons, 1989). 



Each probe of the probe array comprises a single-stranded 
portion, an optional double-stranded portion and a variable sequence within 
the single-stranded portion. These probes may be DNA, RNA, PNA, or any 
combination thereof, and may be derived from natural sources or 
recombinant sources, or be organically synthesized. Preferably, each probe 
has one or more double stranded portions which are about 4 to about 30 
nucleotides in length, preferably about 5 to about 15 nucleotides and more 
preferably about 7 to about 12 nucleotides, and may also be identical within 
the various probes of the array, one or more single stranded portions which 
are about 4 to 20 nucleotides in length, preferably between about 5 to about 
12 nucleotides and more preferably between about 6 to about 10 nucleotides, 
and a variable sequence within the single stranded portion which is about 4 
to 20 nucleotides in length and preferably about 4, 5, 6, 7 or 8 nucleotides 
in length. Overall probe sizes may range from as small as 8 nucleotides in 
length to 100 nucleotides and above. Preferably, sizes are from about 12 
to about 35 nucleotides, and more preferably, from about 12 to about 25 
nucleotides in length. 

Probe sequences may be partly or entirely known, 
determinable or completely unknown. Known sequences can be created, for 
example, by chemically synthesizing individual probes with a specified 
sequence at each region. Probes with determinable variable regions may be 
chemically synthesized with random sequences and the sequence 
information determined separately. Either or both the single-stranded and 
the double-stranded regions may comprise constant sequences such as, for 
example, when an area of the probe or hybridized nucleic acid would benefit 
from having a constant sequence as a point of reference in subsequent 
analyses. 

An advantage of this type of probe is in its structure. 
Hybridization of the target nucleic acid is encouraged due to the favorable 
thermodynamic conditions, including base-stacking interactions, established 
by the presence of the adjacent double strandedness of the probe. Probes 
may be structured with terminal single-stranded regions which consist 
entirely or partly of variable sequences, internal single-stranded regions 
which contain both constant and variable regions, or combinations of these 
structures. Preferably, the probe has a single-stranded region at one 
terminus and a double-stranded region at the opposite terminus. 

Fragmented target sequences, preferably, will have a 
distribution of terminal sequences sufficiently broad so that the nucleotide 
sequence of the hybridized fragments will include the entire sequence of the 
target nucleic acid. Consequently, the typical probe array will comprise a 
collection of probes with sufficient sequence diversity in the variable 
regions to hybridize, with complete or nearly complete discrimination, all 
of the target sequence or the target-derived sequences. The resulting target 
array will comprise the entire target sequence on strands of hybridized 
probes. By way of example only, if the variable portion consisted of a four 
nucleotide sequence (R=4) of adenine, guanine, thymine, and cytosine, the 
total number of possible combinations (4^R) would be 4^4 or 256 different 
nucleic acid probes. If the number of nucleotides in the variable sequence 
was five, the number of different probes within the set would be 4^5 or 1,024. 
In addition, it is also possible to utilize probes wherein the variable 
nucleotide sequence contains gapped segments, or positions along the 
variable sequence which will base pair with any nucleotide or at least not 
interfere with adjacent base pairing. 
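
By way of illustration only, the probe counts given above can be checked by 
enumerating every possible variable sequence of a given length. The function 
name is an assumption made for this sketch. 

```python
from itertools import product


def variable_regions(length, alphabet="AGTC"):
    """Enumerate every variable sequence of `length` nucleotides.

    The four-letter alphabet gives 4**length sequences in total.
    """
    return ["".join(p) for p in product(alphabet, repeat=length)]


four_mers = variable_regions(4)
five_mers = variable_regions(5)
print(len(four_mers), len(five_mers))
```

The enumeration returns 256 sequences for a four nucleotide variable region and 
1,024 for a five nucleotide variable region, in agreement with the example 
above. 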

A nucleic acid strand of the target array may be extended or 
elongated enzymatically. Either the hybridized fragment or one or the other 
of the probe strands can be extended. Extension reactions can utilize various 
regions of the target array as a template. For example, when fragment 
sequences are longer than the hybridizable portion of a probe having a 3' 
single-stranded terminus, the probe will have a 3' overhang and a 5' 
overhang after hybridization of the fragment. The now internal 3' terminus 
of the one strand of the probe can be used as a primer to prime an extension 
reaction using, for example, an appropriate nucleic acid polymerase and 
chain elongating nucleotides. The extended strand of the probe will contain 
sequence information of the entire hybridized fragment. Reaction mixtures 
containing dideoxynucleotides will create a set of extended strands of 
varying lengths and, preferably, a nested set of strands. As the fragments 
have been initially sorted by hybridization to the array, each probe of the 
array will contain sets of nucleic acids that represent each segment of the 
target sequence. Base sequence information can be determined from each 
extended probe. Compilation of the sequence information from the array, 
which may require computer assistance with very large arrays, will allow 
one to determine the sequence of the target. Depending on the structure of 
the probe (e.g. 5' overhang, 3' overhang, internal single-stranded region), 
strands of the probe or strands of hybridized nucleic acid containing target 
sequence can also be enzymatically amplified by, for example, single primer 
PCR reactions. Variations of this process may involve aspects of strand 
displacement amplification, Qβ replicase amplification, self-sustained 
sequence replication amplification and any of the various polymerase chain 
reaction amplification technologies. 
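
By way of illustration only, the determination of base sequence information 
from the molecular weights of a nested set of extended strands can be sketched 
as follows. Each successive mass difference is assigned to the residue whose 
mass is closest. The residue masses are approximate average values for 
unmodified DNA, and the primer mass, tolerance and function names are 
assumptions made for this sketch; none is taken from this specification. 

```python
# Approximate average residue masses in daltons (illustrative assumptions).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def ladder_masses(primer_mass, extension):
    """Simulate the masses of a nested set: the primer plus 0, 1, 2, ... added bases."""
    masses = [primer_mass]
    for base in extension:
        masses.append(masses[-1] + RESIDUE_MASS[base])
    return masses


def read_ladder(masses, tolerance=0.5):
    """Infer the added bases from successive mass differences of a nested set."""
    ordered = sorted(masses)
    bases = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        base = min(RESIDUE_MASS, key=lambda b: abs(RESIDUE_MASS[b] - delta))
        if abs(RESIDUE_MASS[base] - delta) > tolerance:
            raise ValueError(f"unassignable mass difference {delta:.2f}")
        bases.append(base)
    return "".join(bases)


# Hypothetical 5000 dalton primer extended by four bases.
observed = ladder_masses(5000.0, "GATC")
print(read_ladder(observed))
```

The smallest difference between two unmodified residues in this table is about 
9 daltons, so the assumed tolerance of 0.5 dalton leaves no ambiguity here. A 
real measurement would need a tolerance matched to the instrument. 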

Extended nucleic acid strands of the probe can be mass 
modified using a variety of techniques and methodologies. The most 
straightforward may be to enzymatically synthesize the extension utilizing 
a polymerase and nucleotide reagents, such as mass modified chain 
elongating and chain terminating nucleotides. Mass modified nucleotides 
incorporate into the growing nucleic acid chain. Mass modifications may 
be introduced in most sites of the macromolecule which do not interfere 
with the hydrogen bonds required for base pair formation during nucleic 
acid hybridization. Typical modifications include modification of the 
heterocyclic bases, modifications of the sugar moiety (ribose or 
deoxyribose), and modifications of the phosphate group. Specifically, a 
modifying functionality, which may be a chemical moiety, is placed at or 
covalently coupled to the C2, N3, N7 or N8 positions of purines, or the N7 
or N9 positions of deazapurines. Modifications may also be placed at the 
C5 or C6 positions of pyrimidines (e.g. Figures 1A, 1B, 2A and 2B). 
Examples of useful modifying groups include deuterium, F, Cl, Br, I, biotin, 
fluorescein, iododicarbocyanine dye, SiR, Si(CH3)3, Si(CH3)2(C2H5), 
Si(CH3)(C2H5)2, Si(C2H5)3, (CH2)nCH3, (CH2)nNR, 
CH2CONR, (CH2)nOH, CH2F, CHF2 and CF3; wherein n is an integer and R 
is selected from the group consisting of -H, deuterium and alkyls, alkoxys 
and aryls of 1-6 carbon atoms, polyoxymethylene, monoalkylated 
polyoxymethylene, polyethylene imine, polyamide, polyester, alkylated 
silyl, hetero-oligo/polyaminoacid and polyethylene glycol (Figures 3 and 4). 



Mass modifying functionalities may also be generated from 
a precursor functionality such as -N3 or -XR, wherein X is: -OH, -NH2, 
-NHR, -SH, -NCS, -OCO(CH2)nCOOH, -NHCO(CH2)nCOOH, -OSO2OH, 
-OCO(CH2)nI or -OP(O-alkyl)-N(alkyl)2, and n is an integer from 1 to 20; 
and R is: -H, deuterium and alkyls, alkoxys or aryls of 1-6 carbon atoms, 
such as methyl, ethyl, propyl, isopropyl, t-butyl, hexyl, benzyl, benzhydryl, 
trityl, substituted trityl, aryl, substituted aryl, polyoxymethylene, 
monoalkylated polyoxymethylene, polyethylene imine, polyamide, 
polyester, alkylated silyl, heterooligo/polyaminoacid or polyethylene glycol. 
These and other mass modifying functionalities which do not interfere with 
hybridization can be attached to a nucleic acid either alone or in 
combination. Preferably, combinations of different mass modifications are 
utilized to maximize distinctions between nucleic acids having different 
sequences. 

Mass modifications may be major changes of molecular 
weight, such as occurs with coupling between a nucleic acid and a 
heterooligo/polyaminoacid, or more minor such as occurs by substituting 
chemical moieties into the nucleic acid having molecular masses smaller 
than the natural moiety. Non-essential chemical groups may be eliminated 
or modified using, for example, an alkylating agent such as iodoacetamide. 
Alkylation of nucleic acids with iodoacetamide has an additional advantage 
that a reactive oxygen of the 3'-position of the sugar is eliminated. This 
provides one less site per base for alkali cations, such as sodium, to interact. 
Sodium, present in nearly all nucleic acids, increases the likelihood of 
forming satellite adduct peaks upon ionization. Adduct peaks appear at a 
slightly greater mass than the true molecule which would greatly reduce the 
accuracy of molecular weight determinations. These problems can be 
addressed, in part, with matrix selection in mass spectrometric analysis, but 
this only helps with nucleic acids of less than 20 nucleotides. Ammonium 
(NH4+), which can substitute for the sodium cation (Na+) during ion 
exchange, does not increase adduct formation. Consequently, another useful 
mass modification is to remove alkali cations from the entire nucleic acid. 
This can be accomplished by ion exchange with aqueous solutions of 
ammonium such as ammonium acetate, ammonium carbonate, diammonium 
hydrogen citrate, ammonium tartrate and combinations of these solutions. 
DNA dissolved in 3 M aqueous ammonium hydroxide neutralizes all the 
acidic functions of the molecule. As there are no protons, there is a 
significant reduction in fragmentation during procedures such as mass 
spectrometry. 
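
By way of illustration only, the position of sodium adduct peaks relative to 
the true molecule can be sketched by assuming that each adduct replaces one 
proton with one sodium ion. The atomic masses are standard approximate values; 
the parent mass, the replacement model and the function name are assumptions 
made for this sketch. 

```python
NA_MASS = 22.990  # approximate atomic mass of sodium, daltons
H_MASS = 1.008    # approximate atomic mass of hydrogen, daltons


def sodium_adduct_masses(parent_mass, max_sodium):
    """Return masses for the parent molecule carrying 0..max_sodium sodium ions.

    Assumes each sodium replaces one proton, shifting the peak by about 22 Da.
    """
    shift = NA_MASS - H_MASS
    return [round(parent_mass + n * shift, 3) for n in range(max_sodium + 1)]


# Hypothetical 5000 dalton nucleic acid with up to two sodium adducts.
peaks = sodium_adduct_masses(5000.0, 2)
print(peaks)
```

Under these assumptions each satellite peak lies roughly 22 daltons above the 
previous one, which exceeds the mass difference between some pairs of 
nucleotide residues and so could be mistaken for sequence information. 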

Another mass modification is to utilize nucleic acids with non- 
ionic polar phosphate backbones (e.g. PNA). Such nucleotides can be 
generated by oligonucleotide phosphomonothioate diesters or by enzymatic 
synthesis using nucleic acid polymerases and alpha- (α-) thio nucleoside 
triphosphate and subsequent alkylation with iodoacetamide. Synthesis of 
such compounds is straightforward and can be performed and the products 
separated and isolated by, for example, analytical HPLC. 

Mass modification of arrays can be performed before or after 
target hybridization as the modifications do not interfere with hybridization 
of or hybridized nucleic acids. This conditioning of the array is simple to perform 
and easily adaptable in bulk. Probe arrays can therefore be synthesized with 
no special manipulations. Only after the arrays are fixed to solid supports, 
just in fact when it would be most convenient to perform mass modification, 
would probes be conditioned. 

Probe strands may also be mass modified subsequent to 
synthesis by, for example, treating the extended strands with 
an alkylating agent, a thiolating agent or subjecting the nucleic acid to cation 
exchange. Nucleic acids which can be modified include target sequences, 
probe sequences and strands, extended strands of the probe and other 
available fragments. Probes can be mass modified on either strand prior to 
hybridization. Such arrays of mass modified or conditioned nucleic acids 
can be bound to fragments containing the target sequence with no 
interference to the fidelity of hybridization. Subsequent extension of either 
strand of the probe, for example using Sanger sequencing techniques, and 
using the target sequences as templates will create mass modified extended 
strands. The molecular weights of these strands can be determined with 
excellent accuracy. 

Probes may be in solution, such as in wells or on the surface
of a micro-tray, or attached to a solid support. Mass modification can occur
while the probes are fixed to the support, prior to fixation or upon cleavage
from the support, which can occur concurrently with ablation when analyzed
by mass spectrometry. In this regard, it can be important which strand is
released from the support upon laser ablation. Preferably, in such cases, the
probe is differentially attached to the support. One strand may be
permanently and the other temporarily attached or, at least, selectively
releasable.

Examples of solid supports which can be used include a
plastic, a ceramic, a metal, a resin, a gel and a membrane. Useful types of
solid supports include plates, beads, microbeads, whiskers, combs,
hybridization chips, membranes, single crystals, ceramics and
self-assembling monolayers. A preferred embodiment comprises a
two-dimensional or three-dimensional matrix, such as a gel or hybridization
chip with multiple probe binding sites (Pevzner et al., J. Biomol. Struc. &
Dyn. 9:399-410, 1991; Maskos and Southern, Nuc. Acids Res. 20:1679-84, 1992).
Hybridization chips can be used to construct very large probe arrays which
are subsequently hybridized with a target nucleic acid. Analysis of the
hybridization pattern of the chip can assist in the identification of the
target nucleotide sequence. Patterns can be manually or computer analyzed,
but it is clear that positional sequencing by hybridization lends itself to
computer analysis and automation. Algorithms and software have been
developed for sequence reconstruction which are applicable to the methods
described herein (R. Drmanac et al., J. Biomol. Struc. & Dyn. 5:1085-1102,
1991; P.A. Pevzner, J. Biomol. Struc. & Dyn. 7:63-73, 1989).

Nucleic acid probes may be attached to the solid support by
covalent binding such as by conjugation with a coupling agent, or by
covalent or non-covalent binding such as electrostatic interactions,
hydrogen bonds or antibody-antigen coupling, or by combinations thereof.
Typical coupling agents include biotin/avidin, biotin/streptavidin,
Staphylococcus aureus protein A/IgG antibody fragment, and
streptavidin/protein A chimeras (T. Sano and C.R. Cantor, Bio/Technology
9:1378-81, 1991), or derivatives or combinations of these agents. Nucleic
acids may be attached to the solid support by a photocleavable bond, an
electrostatic bond, a disulfide bond, a peptide bond, a diester bond or a
combination of these sorts of bonds. The array may also be attached to the
solid support by a selectively releasable bond such as
4,4'-dimethoxytrityl or its derivative. Derivatives which have been found
to be useful include 3 or 4 [bis-(4-methoxyphenyl)]-methyl-benzoic acid,
N-succinimidyl-3 or 4 [bis-(4-methoxyphenyl)]-methyl-benzoic acid,
N-succinimidyl-3 or 4 [bis-(4-methoxyphenyl)]-hydroxymethyl-benzoic acid,
N-succinimidyl-3 or 4 [bis-(4-methoxyphenyl)]-chloromethyl-benzoic acid,
and salts thereof.

Binding may be reversible or permanent where strong
associations would be critical. In addition, probes may be attached to solid
supports via spacer moieties between the probes of the array and the solid
support. Useful spacers include a coupling agent, as described above for
binding to other or additional coupling partners, or to render the attachment
to the solid support cleavable.

Cleavable attachments may be created by attaching cleavable
chemical moieties between the probes and the solid support such as an
oligopeptide, oligonucleotide, oligopolyamide, oligoacrylamide,
oligoethylene glycerol, alkyl chains of between about 6 to 20 carbon atoms,
and combinations thereof. These moieties may be cleaved with added
chemical agents, electromagnetic radiation or enzymes. Examples of
attachments cleavable by enzymes include peptide bonds which can be
cleaved by proteases and phosphodiester bonds which can be cleaved by
nucleases. Chemical agents such as β-mercaptoethanol, dithiothreitol (DTT)
and other reducing agents cleave disulfide bonds. Other agents which may
be useful include oxidizing agents, hydrating agents and other selectively
active compounds. Electromagnetic radiation such as ultraviolet, infrared
and visible light cleaves photocleavable bonds. Attachments may also be
reversible such as, for example, using heat or enzymatic treatment, or
reversible chemical or magnetic attachments. Release and reattachment can
be performed using, for example, magnetic or electrical fields.

Hybridized probes can provide direct or indirect information
about the hybridized sequence. Direct information may be obtained from
the binding pattern of the array wherein probe sequences are known or can
be determined. Indirect information requires additional analysis of a
plurality of nucleic acids of the target array. For example, a specific nucleic
acid sequence will have a unique or relatively unique molecular weight
depending on its size and composition. That molecular weight can be
determined, for example, by chromatography (e.g. HPLC), nuclear magnetic
resonance (NMR), high-definition gel electrophoresis, capillary
electrophoresis (e.g. HPCE), spectroscopy or mass spectrometry.
Preferably, molecular weights are determined by measuring the mass/charge
ratio with mass spectrometry technology.
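
The dependence of molecular weight on size and composition can be sketched
as follows. This is an illustration under stated assumptions, not part of
the specification: the residue masses are commonly tabulated average values
for unmodified deoxynucleotides within a DNA chain, the end correction
assumes free 5'- and 3'-hydroxyl termini, and the function name is
hypothetical.

```python
# Average masses (Da) of deoxynucleotide monophosphate residues in a DNA
# chain; commonly tabulated values, used here as assumptions.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Assumed correction for an oligonucleotide with 5'-OH and 3'-OH ends.
END_CORRECTION = -61.96


def oligo_mass(seq):
    """Approximate average molecular weight of a single-stranded DNA oligo."""
    total = sum(RESIDUE_MASS[base] for base in seq.upper())
    return round(total + END_CORRECTION, 2)


# Same length, different composition: different mass.
print(oligo_mass("ACGT"), oligo_mass("AAAA"))
# Same composition, different order: identical mass.
print(oligo_mass("ACGT") == oligo_mass("TGCA"))
```

The last comparison shows why a mass is only "relatively unique": sequences
of identical composition cannot be told apart by mass alone, so the known
probe sequence and reaction history are still needed.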

Mass spectrometry of biopolymers such as nucleic acids can
be performed using a variety of techniques (e.g. U.S. Patent Nos. 4,442,354;
4,931,639; 5,002,868; 5,130,538; 5,135,870; 5,174,962). Difficulties
associated with volatization of high molecular weight molecules such as
DNA and RNA have been overcome, at least in part, with advances in
techniques, procedures and electronic design. Further, only small quantities
of sample are needed for analysis, the typical sample being a mixture of 10
or so fragments. Quantities which range from between about 0.1 femtomole
to about 1.0 nanomole, preferably between about 1.0 femtomole to about
1000 femtomoles and more preferably between about 10 femtomoles to
about 100 femtomoles are typically sufficient for analysis. These amounts
can be easily placed onto the individual positions of a suitable surface or
attached to a support.

Another of the important features of this invention is that it is
unnecessary to volatize large lengths of nucleic acids to determine sequence
information. Using the methods of the invention, segments of the nucleic
acid target, discretely isolated into separate complexes on the target array,
can be sequenced and those sequence segments collated, making it
unnecessary to have to volatize the entire strand at once. Techniques which
can be used to volatize a nucleic acid fragment include fast atom
bombardment, plasma desorption, matrix-assisted laser
desorption/ionization, electrospray, photochemical release, electrical release,
droplet release, resonance ionization and combinations of these techniques.

In electrohydrodynamic ionization, thermospray, aerospray
and electrospray, the nucleic acid is dissolved in a solvent and injected with
the help of heat, air or electricity, directly into the ionization chamber. If the
method of ionization involves a light beam, particle beam or electric
discharge, the sample may be attached to a surface and introduced into the
ionization chamber. In such situations, a plurality of samples may be
attached to a single surface or multiple surfaces and introduced
simultaneously into the ionization chamber and still analyzed individually.
The appropriate sector of the surface which contains the desired nucleic acid
can be moved to proximate the path of an ionizing beam. After the beam is
pulsed on and the surface bound molecules are ionized, a different sector of
the surface is moved into the path of the beam and a second sample, with the
same or different molecule, is analyzed without reloading the machine.
Multiple samples may also be introduced at electrically isolated regions of
a surface. Different sectors of the chip are connected to an electrical source
and ionized individually. The surface to which the sample is attached may
be shaped for maximum efficiency of the ionization method used. For field
ionization and field desorption, a pin or sharp edge is an efficient solid
support and for particle bombardment and laser ionization, a flat surface.

The goal of ionization for mass spectroscopy is to produce a
whole molecule with a charge. Preferably, matrix-assisted laser
desorption/ionization (MALDI) or electrospray (ES) mass spectroscopy is
used to determine molecular weight and, thus, sequence information from
the target array. It will be recognized by those of ordinary skill that a
variety of methods may be used which are appropriate for large molecules
such as nucleic acids. Typically, a nucleic acid is dissolved in a solvent and
injected into the ionization chamber using electrohydrodynamic ionization,
thermospray, aerospray or electrospray. Nucleic acids may also be attached
to a surface and ionized with a beam of particles or light. Particles which
have been successfully used include plasma (plasma desorption), ions (fast
ion bombardment) or atoms (fast atom bombardment). Ions have also been
produced with the rapid application of laser energy (laser desorption) and
electrical energy (field desorption).

In mass spectrometer analysis, the sample is ionized briefly by
a pulse of laser beams or by an electric field induced spray. The ions are
accelerated in an electric field and sent at a high velocity into the analyzer
portion of the spectrometer. The speed of the accelerated ion increases with
the charge (z) and decreases with the mass (m) of the ion; for a fixed
accelerating potential it scales with the square root of z/m. The mass of
the molecule may be deduced from the flight characteristics of its ion. For
small ions, the typical detector has a magnetic field which functions to
constrain the ion stream into a circular path. The radii of the paths of
equally charged particles in a uniform magnetic field are directly
proportional to mass. That is, a heavier particle with the same
charge as a lighter particle will have a larger flight radius in a magnetic
field. It is generally considered to be impractical to measure the flight
characteristics of large ions such as nucleic acids in a magnetic field because
the relatively high mass to charge (m/z) ratio requires a magnet of unusual
size or strength. To overcome this limitation the electrospray method, for
example, can consistently place multiple ions on a molecule. Multiple
charges on a nucleic acid will decrease the mass to charge ratio, allowing a
conventional quadrupole analyzer to detect species of up to 100,000 daltons.
Nucleic acid ions generated by matrix assisted laser
desorption/ionization only have a unit charge and, because of their large
mass, generally require analysis by a time of flight analyzer. Time of flight
analyzers are basically long tubes with a detector at one end. In the
operation of a TOF analyzer, a sample is ionized briefly and accelerated
down the tube. After detection, the time needed for travel down the detector
tube is calculated. The mass of the ion may be calculated from the time of
flight. TOF analyzers do not require a magnetic field and can detect unit
charged ions with a mass of up to 100,000 daltons. For improved resolution,
the time of flight mass spectrometer may include a reflectron, a region at the
end of the flight tube which negatively accelerates ions. Moving particles
entering the reflectron region, which contains a field of opposite polarity to
the accelerating field, are retarded to zero speed and then reverse accelerated
out with the same speed but in the opposite direction. In the use of an
analyzer with a reflectron, the detector is placed on the same side of the
flight tube as the ion source to detect the returned ions, and the effective
length of the flight tube and the resolution power are effectively doubled.
The calculation of mass to charge ratio from the time of flight data takes into
account the time spent in the reflectron.
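
The relation between flight time and mass can be sketched for an idealized
linear analyzer. The following is an assumption-laden illustration, not the
instrument calibration used in practice: it ignores the time spent in the
acceleration region and in any reflectron, treats all ions as starting from
rest, and uses hypothetical function names together with an assumed 20 kV
accelerating potential and 1 m flight tube.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # kilograms per dalton


def flight_time(mass_da, charge, voltage, length_m):
    """Drift time (s) of an ion accelerated from rest through `voltage`."""
    mass_kg = mass_da * DALTON
    speed = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return length_m / speed


def mass_from_time(time_s, charge, voltage, length_m):
    """Invert flight_time: recover the ion mass in daltons."""
    mass_kg = 2 * charge * E_CHARGE * voltage * time_s ** 2 / length_m ** 2
    return mass_kg / DALTON


# Assumed instrument settings: 20 kV acceleration, 1 m drift tube.
t_small = flight_time(10000.0, 1, 20000.0, 1.0)
t_large = flight_time(40000.0, 1, 20000.0, 1.0)
print(t_large / t_small)                      # four times the mass, twice the time
print(mass_from_time(t_small, 1, 20000.0, 1.0))
```

In this simplified model the flight time grows with the square root of m/z,
so a fourfold heavier singly charged ion arrives after twice the time.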

Ions with the same charge to mass ratio will typically leave the
ion accelerators with a range of energies because the ionization region of
a mass spectrometer is not a point source. Ions generated further away from
the flight tube spend a longer time in the accelerator field and enter the
flight tube at a higher speed. Thus ions of a single species of molecule will
arrive at the detector at different times. In time of flight analysis, a longer
time in the flight tube in theory provides more sensitivity, but due to the
different speeds of the ions, the noise (background) will also be increased.
A reflectron, besides effectively doubling the effective length of the flight
tube, can reduce the error and increase sensitivity by reducing the spread of
detector impingement time of a single species of ions. An ion with a higher
velocity will enter the reflectron at a higher velocity and stay in the
reflectron region longer than a lower velocity ion. If the reflectron electrode
voltages are arranged appropriately, the peak width contribution from the
initial velocity distribution can be largely corrected for at the plane of the
detector. The correction provided by the reflectron leads to increased mass
resolution for all stable ions, those which do not dissociate in flight, in the
spectrum.

While a linear field reflectron functions adequately to reduce
noise and enhance sensitivity, reflectrons with more complex field strengths
offer superior correctional abilities and a number of complex reflectrons can
be used. The double stage reflectron has a first region with a weaker electric
field and a second region with a stronger electric field. The quadratic and
the curve field reflectron have an electric field which increases as a function
of the distance. These functions, as their names imply, may be a quadratic
or a complex exponential function. The dual stage, quadratic, and curve
field reflectrons, while more elaborate, are also more accurate than the linear
reflectron.

The detection of ions in a mass spectrometer is typically
performed using electron detectors. To be detected, the high mass ions
produced by the mass spectrometer are converted into either electrons or low
mass ions at a conversion electrode. These electrons or low mass ions are
then used to start the electron multiplication cascade in an electron
multiplier and further amplified with a fast linear amplifier. The signals
from multiple analyses of a single sample are combined to improve the
signal to noise ratio and the peak shapes, which also increases the accuracy
of the mass determination.

This invention is also directed to the detection of multiple
primary ions directly through the use of ion cyclotron resonance and Fourier
analysis. This is useful for the analysis of a complete sequencing ladder
immobilized on a surface. In this method, a plurality of samples are ionized
at once and the ions are captured in a cell with a high magnetic field. An RF
field excites the population of ions into cyclotron orbits. Because the
frequencies of the orbits are a function of mass, an output signal
representing the spectrum of the ion masses is obtained. This output is
analyzed by a computer using Fourier analysis which reduces the combined
signal to its component frequencies and thus provides a measurement of the
ion masses present in the ion sample. Ion cyclotron resonance and Fourier
analysis can determine the masses of all nucleic acids in a sample. The
application of this method is especially useful on a sequencing ladder.
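
The statement that orbital frequency is a function of mass can be made
concrete with the standard cyclotron relation f = zeB / (2πm). The sketch
below is illustrative only: the function names and the 7 tesla field are
assumptions, and a real instrument would apply a calibration rather than
this ideal formula.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # kilograms per dalton


def cyclotron_frequency(mass_da, charge, field_tesla):
    """Ideal cyclotron frequency (Hz) of an ion in a uniform magnetic field."""
    return charge * E_CHARGE * field_tesla / (2 * math.pi * mass_da * DALTON)


def mass_from_frequency(freq_hz, charge, field_tesla):
    """Invert cyclotron_frequency: recover the ion mass in daltons."""
    return charge * E_CHARGE * field_tesla / (2 * math.pi * freq_hz * DALTON)


# Assumed 7 T magnet; two ladder members differing by one residue (~313 Da).
f_short = cyclotron_frequency(6000.0, 1, 7.0)
f_long = cyclotron_frequency(6313.21, 1, 7.0)
print(f_short > f_long)                       # heavier ion orbits more slowly
print(mass_from_frequency(f_long, 1, 7.0))
```

Each component frequency recovered by Fourier analysis of the combined
signal would be converted to a mass in this way.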

The data from mass spectrometry, either performed singly or
in parallel (multiplexed), can determine the molecular mass of a nucleic acid
sample. The molecular mass, combined with the known sequence of the
sample, can be analyzed to determine the length of the sample. Because
different bases have different molecular weights, the output of a high
resolution mass spectrometer, combined with the known sequence and
reaction history of the sample, will determine the sequence and length of the
nucleic acid analyzed. In the mass spectroscopy of a sequencing ladder,
generally the base sequence of the primers is known. From a known
sequence of a certain length, the added base of a sequence one base longer
can be deduced by a comparison of the mass of the two molecules. This
process is continued until the complete sequence of a sequencing ladder is
determined.
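
The stepwise deduction described above can be sketched as a comparison of
successive ladder masses. This is a simplified illustration rather than the
method of the specification: it assumes unmodified nucleotides, commonly
tabulated average residue masses, a complete ladder with no missing rungs,
and a hypothetical function name and tolerance.

```python
# Assumed average residue masses (Da) for unmodified deoxynucleotides.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def read_ladder(masses, tolerance=0.5):
    """Deduce the added bases from mass differences between ladder members."""
    ordered = sorted(masses)
    bases = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        diff = heavier - lighter
        # Pick the residue whose mass is closest to the observed difference.
        base, expected = min(RESIDUE_MASS.items(),
                             key=lambda item: abs(item[1] - diff))
        if abs(expected - diff) > tolerance:
            raise ValueError(f"mass step {diff:.2f} matches no single base")
        bases.append(base)
    return "".join(bases)


# Hypothetical ladder: a 3000.00 Da primer extended by G, then A, then T.
ladder = [3000.00, 3329.21, 3642.42, 3946.62]
print(read_ladder(ladder))
```

The smallest gap between two residue masses in this table is about 9 Da
(A versus T), which indicates the mass accuracy such a reading would need.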

Another embodiment of the invention is directed to a method
for detecting a target nucleic acid. As before, a set of nucleic acids
complementary or homologous to a sequence of the target is hybridized to
an array of nucleic acid probes. The molecular weights of the hybridized
nucleic acids are determined by, for example, mass spectrometry and the
nucleic acid target detected by the presence of its sequence in the sample. As
the object is not to obtain extensive sequence information, probe arrays may
be fairly small with the critical sequences, the sequences to be detected,
repeated in as many variations as possible. Variations may have greater than
95% homology to the sequence of interest, greater than 80%, greater than
70% or greater than about 60%. Variations may also have additional
sequences not required or present in the target sequence to increase or
decrease the degree of hybridization. Sensitivity of the array to the target
sequence is increased while reducing and hopefully eliminating the number
of false positives.

Target nucleic acids to be detected may be obtained from a
biological sample, an archival sample, an environmental sample or another
source expected to contain the target sequence. For example, samples may
be obtained from biopsies of a patient and the presence of the target
sequence is indicative of the disease or disorder such as, for example, a
neoplasm or an infection. Samples may also be obtained from
environmental sources such as bodies of water, soil or waste sites to detect
the presence and possibly identify organisms and microorganisms which may
be present in the sample. The presence of particular microorganisms in the
sample may be indicative of a dangerous pathogen or that the normal flora
is present.

Another embodiment of the invention is directed to the arrays
of nucleic acid probes useful in the above-described methods and
procedures. These probes comprise a first strand and a second strand
wherein the first strand is hybridized to the second strand forming a
double-stranded portion, a single-stranded portion and a variable sequence
within the single-stranded portion. The array may be attached to a solid
support such as a material that facilitates volatization of nucleic acids for
mass spectrometry. Typically, arrays comprise large numbers of probes such
as less than or equal to about 4^R different probes, where R is the length in
nucleotides of the variable sequence. When utilizing arrays for large scale
sequencing, larger arrays can be used whereas arrays which are used for
detection of specific sequences may be fairly small as many of the potential
sequence combinations will not be necessary.
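
The 4^R bound on array size follows from enumerating every possible
variable sequence of length R over the four bases. The sketch below is an
illustration only; the function name is hypothetical and R = 5 is chosen
simply as an example length.

```python
from itertools import product


def variable_sequences(r):
    """List every possible variable sequence of length r over A, C, G, T."""
    return ["".join(bases) for bases in product("ACGT", repeat=r)]


# Example: a five-base variable region.
probes = variable_sequences(5)
print(len(probes))       # 4**5 members in a complete array
print(probes[:3])
```

A detection array would use only a chosen subset of this list, which is why
it can be much smaller than a full sequencing array.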

Arrays may also comprise nucleic acid probes which are
entirely single-stranded and nucleic acids which are single-stranded, but
possess hairpin loops which create double-stranded regions. Such structures
can function in a manner similar if not identical to the partially
single-stranded probes, which comprise two strands of nucleic acid, and have
the additional advantage of thermodynamic energy available in the secondary
structure.

Arrays may be in solution or fixed on a solid support through
streptavidin-biotin interactions or other suitable coupling agents. Arrays
may also be reversibly fixed to the solid support using, for example,
chemical moieties which can be cleaved with electromagnetic radiation,
chemical agents and the like. The solid support may comprise materials
such as matrix chemicals which assist in the volatization process for mass
spectrometric analysis. Such chemicals include nicotinic acid,
3'-hydroxypicolinic acid, 2,5-dihydroxybenzoic acid, sinapinic acid, succinic
acid, glycerol, urea and Tris-HCl, pH about 7.3.

Another embodiment of the invention is directed to
sequencing double-stranded nucleic acids using strand-displacement
polymerization. With this method it is unnecessary to denature the double
strands to obtain sequence information. Strand-displacement polymerization
creates a new strand while simultaneously displacing the existing strand.
Techniques for incorporating label into the growing strand are well-known
and the newly polymerized strand is easily detected by, for example, mass
spectrometry.






Target nucleic acid or nucleic acids containing sequences that
correspond to the sequence of the target are digested, for example, with
restriction enzymes, in one or more steps to create a set of fragments which
are partially single-stranded and partially double-stranded. Another set of
nucleic acids, the probes, are also partially single-stranded and partially
double-stranded. These probes preferably contain a variable or constant
region within the single-stranded portion of the terminus of each fragment
(5'- or 3'-overhangs). Probes or fragments are treated with a phosphatase to
remove phosphate groups from the 5'-termini of the nucleic acids.
Phosphatase treatment prevents nucleic acid ligation by ligase, which
requires a terminal 5'-phosphate to covalently link to a 3'-hydroxyl.
Single-stranded regions of the fragments are hybridized to single-stranded
regions of the probes, forming an array of hybridized target/probe complexes.
Adjacent or abutting nucleic acid strands of the complex are ligated,
covalently joining a strand of the fragment to a strand of the probe.
Phosphatase treatment prevents both self-ligation of phosphatase-treated
nucleic acids and ligation between the 5'-termini of phosphatased nucleic
acids and the 3'-termini of untreated nucleic acids. These complexes are
treated with a nucleic acid polymerase that recognizes and binds to the nick
in the unligated strand to initiate polymerization. The polymerase
synthesizes a new strand using the ligated strand as a template, while
displacing the complementary strand. The reaction may be supplemented
with labeled or mass modified nucleotides (e.g. mass modifications at
positions C2, N3, N7 or C8 of purine, or at N7 or N9 of deazapurine) or
other detectable markers that will allow for the detection of new synthesis.
Either the probes or the fragments may be fixed to a solid support such as
a plastic or glass surface, membrane or structure (magnetic bead) which
eliminates the need for repetitive extractions or other purification of nucleic
acids between steps.

Preferably, double-stranded nucleic acids containing target
sequences are obtained by polymerase chain reaction or enzymatic digestion
(e.g. restriction enzymes) of the target sequence. Target sequences may be
DNA, RNA, RNA/DNA hybrids, cDNA, PNA or modifications or
combinations thereof and are preferably from about 10 to about 1,000
nucleotides in length, more preferably from about 20 to about 500
nucleotides in length, and even more preferably from about 35 to about 250
nucleotides in length. 5'-termini of the nucleic acid fragments or probes may
be dephosphorylated with a phosphatase, such as alkaline or calf intestinal
phosphatase, which eliminates the action of a nucleic acid ligase. Upon
hybridization of fragment to probe, only one of the two internal 5'-3'
junctions contains a 5'-phosphate and is capable of ligation. The second
junction appears as a nick in a strand of the complex. Nucleic acid
polymerases, such as Klenow, recognize the nick and synthesize a new
strand while displacing the complementary, ligated strand. Chain elongation
can proceed in the presence of, for example, nucleotide triphosphates and
chain terminating nucleotides. Nucleic acid synthesis terminates when a
dideoxynucleotide is incorporated into the elongating strand. The resulting
fragments represent a nested set of the sequence of the target. Precursor
nucleotides may be labeled with, for example, mass modifications. The
mass modified fragments can be easily analyzed by mass spectrometry to
determine the sequence of the target. Complexes may further comprise
single-stranded binding protein (SSB; E. coli) which increases stability of
the complex and facilitates polymerase action. Bands otherwise obscured are
more easily detected. SSB can be used to sequence fragments of greater
than 100 nucleotides, preferably greater than 150 nucleotides and more
preferably greater than 200 nucleotides.

This method is generally useful for manual or automated
nucleic acid sequencing, and especially useful for identifying and
sequencing a single or group of nucleic acid species in a mixed background
containing a plurality of species of different sequences. In this method,
selection is performed upon hybridization and ligation of fragments to
probes. Probes may be designed to contain a common or variable sequence
within the single-stranded region that is complementary to a sequence of the
fragment to be identified and, if desired, sequenced. Stringency of
fragment/probe hybridization can be adjusted by methods well-known to
those of ordinary skill to match desired conditions of selection. For
example, the single-stranded region of the probe can be designed to contain
a specific sequence only found on the single-stranded region of the nucleic
acid fragment of interest. Alternatively, multiple probes containing multiple
variable regions may be used to select for those fragment sequences which
may be longer than the length of the single-stranded region of any one
probe. Hybridization and ligation select the specific fragment from a
complex mixture of different fragments and only that specific fragment is
subsequently sequenced.

Probes are typically from about 15 to about 200 nucleotides
in length, but can be larger or smaller depending on the particular
application. Single-stranded regions of the probes may be about 3, 4, 5, 6,
7, 8, 9, 10, 12, 15, 20, 22, 25 or 30 nucleotides in length or larger. For
probes containing a variable region within the single-stranded region, the
length of this variable region may be the same or smaller than the length of
the entire single-stranded portion. Variable regions may be distinct between
probes or common within sets of probes. The double-stranded region of the
probe is typically larger than the single-stranded region and may be about
4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 35, 40 or 50
nucleotides in length or larger. Probes may also be modified to facilitate
attachment to a solid support or other surfaces, or modified to be
individually detectable for identification or other purposes. Sets of nucleic
acids, either fragments or
10 probes, preferably contain greater than 10^ 10\ 10\ iO^ 10* 10\ 10^ 10' 
or 10'* different members. 

Another embodiment of the invention is directed to kits for
detecting a sequence of a target nucleic acid. An array of nucleic acid
probes is fixed to a solid support which may be coated with a matrix
chemical that facilitates volatization of nucleic acids for mass spectrometry.
Kits can be used to detect diseases and disorders in biological samples by
detecting specific nucleic acid sequences which are indicative of the
disorder. Probes may be labeled with detectable labels which only become
detectable upon hybridization with a correctly matched target sequence.
Detectable labels include radioisotopes, metals, luminescent or
bioluminescent chemicals, fluorescent chemicals, enzymes and
combinations thereof.

Another embodiment of the invention is directed to nucleic
acid sequencing systems which comprise a mass spectrometer, a computer
loaded with appropriate software for analysis of nucleic acids and an array
of probes which can be used to capture a target nucleic acid sequence.
Systems may be manual or automated as desired.

The following experiments are offered to illustrate
embodiments of the invention, and should not be viewed as limiting the
scope of the invention.
Examples

Example 1 Preparation of Target Nucleic Acid

Target nucleic acid is prepared by restriction endonuclease
cleavage of cosmid DNA. The properties of type II and other restriction
nucleases that cleave outside of their recognition sequences were exploited.
A restriction digestion of a 10 to 50 kb DNA sample with such an enzyme
produced a mixture of DNA fragments most of which have unique ends.
Recognition and cleavage sites of useful enzymes are shown in Table 1.

Table 1

Restriction Enzymes and Recognition Sites for PSBH

Mwo I     GCNNNNN-NNGC
          CGNN-NNNNNCG

BsiY I    CCNNNNN-NNGG
          GGNN-NNNNNCC

ApaB I    GCANNNNN-TGC
          CGT-NNNNNACG

Mnl I     CCTCN7
          GGAGN6

TspR I    NNCAGTGNN
          NNGTCACNN

^ CCANNNNNN-GTNNNN 
GGTNNNNNN-CANNNN 

\ 

I 

0« PI CCANNNNN-NNTCNN 
GGTNNNhRvl-NNAGNN 
t 



One restriction enzyme, ApaB I, with a 6 base pair
recognition site may also be used. DNA sequencing is best served by
enzymes that produce average fragment lengths comparable to the lengths
of DNA sequencing ladders analyzable by mass spectrometry. At present
these lengths are about 100 bases or less.
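
Locating recognition sites that contain unspecified (N) positions can be
sketched with a regular expression. This is an illustrative sketch only:
the function names and the test sequence are hypothetical, and the pattern
GCNNNNNNNGC is the Mwo I site of Table 1 written without the cleavage mark.

```python
import re


def site_regex(pattern):
    """Compile a recognition pattern in which N stands for any base."""
    body = "".join("[ACGT]" if ch == "N" else ch for ch in pattern.upper())
    # A lookahead lets overlapping sites be reported as well.
    return re.compile("(?=" + body + ")")


def find_sites(seq, pattern):
    """Return the 0-based start position of every site in seq."""
    return [m.start() for m in site_regex(pattern).finditer(seq.upper())]


# Hypothetical sequence holding one Mwo I site that starts at position 2.
sequence = "AAGC" + "A" * 7 + "GCTT"
print(find_sites(sequence, "GCNNNNNNNGC"))
```

The distances between successive positions give the fragment lengths, which
can then be compared with the roughly 100 base limit noted above.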

BsiY I and Mwo I restriction endonucleases are used together
to digest DNA in preparation of PSBH. Target DNA is cleaved to
completion and complexed with PSBH probes either before or after melting.
The fraction of fragments with unique ends or degenerate ends depends on
the complexity of the target sequence. For example, a 10 kilobase clone
would yield on average 16 fragments or a total of 32 ends since each
double-stranded DNA target produces two ligatable 3' ends. With 1024
possible ends, Poisson statistics (Table 2) predict that there would be 3%
degeneracies. In contrast, a 40 kilobase cosmid insert would yield 64
fragments or 128 ends, of which 12% would be degenerate, and a
50 kilobase sample would yield 80 fragments or 160 ends. Some of these
would surely be degenerate. Up to at least 100 kilobases, the larger the
target the more sequence is available from each multiplex DNA sample
preparation. With a 100 kilobase target, 27% of the targets would be
degenerate.

Table 2

Poisson Distribution of Restriction Enzyme Sites

Target size          Mwo I                    TspR I
   (kb)       Sequencing  Assembly     Sequencing  Assembly

    10           0.97       0.60          0.94       0.94

    40           0.88       0.14          0.80       0.80

   100           0.73       0.01          0.57       0.57
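
The degeneracy percentages quoted in the text appear consistent with a
simple Poisson estimate in which the chance that a given end shares its
overhang with no other end is exp(-ends/possible ends). The sketch below is
offered under that assumption only; the function names are hypothetical,
and the 320 ends used for a 100 kilobase target are extrapolated from the
16 fragments per 10 kilobases stated above.

```python
import math


def unique_fraction(n_ends, n_types):
    """Poisson estimate of the fraction of ends with a unique overhang."""
    return math.exp(-n_ends / n_types)


def degenerate_fraction(n_ends, n_types):
    """Complementary fraction of ends expected to be degenerate."""
    return 1.0 - unique_fraction(n_ends, n_types)


# Ends per target as stated or extrapolated in the text; 1024 possible ends.
for size_kb, ends in [(10, 32), (40, 128), (100, 320)]:
    print(size_kb, round(degenerate_fraction(ends, 1024), 2))

# A typical cosmid digested with TspR I: 60 ends, 256 possible overhangs.
print(round(unique_fraction(60, 256), 2))
```

Under this assumption the three degenerate fractions round to 3%, 12% and
27%, and their complements match the Mwo I Sequencing column of Table 2.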






With BsiY I and Mwo I, any restriction site that yields a unique
5 base end may be captured twice and the resulting sequence data obtained
will read away from the site in both directions (Figure 5). With the
knowledge of three bases of overlapping sequence at the site, this sorts all
sequences into 64 different categories. With 10 kilobase targets, 60% will
contain fragments and, thus, sequence assembly is automatic.

Two array capture methods can be used with Mwo I and BsiY I.
In the first method, conventional five base capture is used. Because the
two target bases adjacent to the capture site are known, since they form
part of the restriction enzyme recognition sequence, an alternative capture
strategy would build the complement of these two bases into the capture
sequence. Seven base capture is thermodynamically more stable, but less
discriminating against mismatches.

TspR I is another commercially available restriction enzyme
with properties that are very attractive for use in PSBH-mediated Sanger
sequencing. The method for using TspR I is shown in Figure 6. TspR I has
a five base recognition site and cuts two bases outside this site on each
strand to yield nine base 3' single-stranded overhangs. These can be
captured with partially duplex probes with complementary nine base
overhangs. Because only four bases are not specified by enzyme
recognition, TspR I digestion results in only 256 types of cleavage sites.
With human DNA the average fragment length that results is 1370 bases.
This enzyme is ideal for generating long sequence ladders and is useful as
input to long thin gel sequencing, where reads up to a kilobase are common.
A typical human cosmid yields about 30 TspR I fragments or 60 ends. Given
the length distribution expected, many of these could not be sequenced fully
from one end. With 256 possible overhangs, Poisson statistics (Table 2)
indicate that 80% of adjacent fragments can be assembled with no additional
labor. Thus, very long blocks of continuous DNA sequence are produced.
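
The count of 256 cleavage-site types follows from the four unspecified
positions in the nine base overhang. A minimal sketch, assuming the
overhang pattern NNCAGTGNN listed for TspR I and a 40 kilobase cosmid
insert (the helper name expand is hypothetical):

```python
from itertools import product

def expand(pattern: str) -> list[str]:
    """Expand every N in a pattern into A, C, G and T."""
    pools = ["ACGT" if ch == "N" else ch for ch in pattern]
    return ["".join(bases) for bases in product(*pools)]

overhangs = expand("NNCAGTGNN")
print(len(overhangs))            # number of distinct nine base overhangs

# Expected TspR I fragments in a 40 kb cosmid at 1370 bases per fragment
# (cosmid size is an assumption taken from the earlier Mwo I example).
fragments = 40_000 / 1370
print(round(fragments), round(2 * fragments))
```

The fragment estimate comes out close to the "about 30 fragments or 60
ends" quoted in the text.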

Three additional restriction enzymes are also useful. These
are Mnl I, Cje I and CjeP I (Table 1). The first has a four base site with
one A+T and should give smaller human DNA fragments on average than Mwo I
or BsiY I. The latter two have unusual interrupted five base recognition
sites and might supplement TspR I.

Target DNA may also be prepared by tagged PCR. It is
possible to add a preselected five base 3' terminal sequence to a target DNA
using a PCR primer five bases longer than the known target sequence
priming site. Samples made in this way can be captured and sequenced
using the PSBH approach based on the five base tag. A biotin was used to
allow purification of the complementary strand prior to use as an
immobilized sequencing template. A biotin may also be placed on the tag.
After capture of the duplex PCR product by streptavidin-coated magnetic
microbeads, the desired strand (needed to serve as a sequencing template)
could be denatured from the duplex and used to contact the entire probe
array. For multiplex sample preparation, a series of different five base
tagged primers would be employed, ideally in a single multiplex PCR
reaction. This approach also requires knowing enough target sequence for
unique PCR amplification and is more useful for shotgun sequencing or
comparative sequencing than for de novo sequencing.
Example 2 Aspects of Positional Sequencing by Hybridization

An examination of the potential advantages of stacking
hybridization has been carried out by both calculations and pilot
experiments. Some calculated Tm's for perfect and mismatched duplexes are
shown in Figure 7. These are based on average base compositions. The
calculations revealed that the binding of a second oligomer next to a
pre-formed duplex provides an extra stability equal to about two base pairs
and that mis-pairing seems to have a larger consequence on stacking
hybridization than it does on ordinary hybridization. Other types of
mis-pairing are less destabilizing, but these can be eliminated by requiring
a ligation step. In standard SBH, a terminal mismatch is the least
destabilizing event, and leads to the greatest source of ambiguity or
background. For an octanucleotide complex, an average terminal mismatch
leads to a 6°C lowering in Tm. For stacking hybridization, a terminal
mismatch on the side away from the pre-existing duplex is the least
destabilizing event. For a pentamer, this leads to a drop in Tm of 10°C.
These considerations indicate that the discrimination power of stacking
hybridization in favor of perfect duplexes is greater than that of ordinary
SBH.
Example 3 Preparation of Model Arrays

In a single synthesis, all 1024 possible single-stranded probes
with a constant 18 base stalk followed by a variable 5 base extension can be
created. The 18 base extension is designed to contain two restriction
enzyme cutting sites. Hga I generates a 5 base, 5' overhang consisting of
the variable bases N5. Not I generates a 4 base, 5' overhang at the constant
end of the oligonucleotide. The synthetic 23-mer mixture hybridized with a
complementary 18-mer forms a duplex which can be enzymatically
extended to form all 1024 23-mer duplexes. These are cloned by, for
example, blunt end ligation, into a plasmid which lacks Not I sites. Colonies
containing the cloned 23-base insert are selected and each clone contains
one unique sequence. DNA minipreps can be cut at the constant end of the
stalk, filled in with biotinylated pyrimidines and cut at the variable end of
the stalk to generate the 5 base 5' overhang. The resulting nucleic acid is
fractionated by Qiagen columns (nucleic acid purification columns) to
discard the high molecular weight material. The nucleic acid probe will then
be attached to a streptavidin-coated surface. This procedure could easily be
automated in a Beckman Biomec or equivalent chemical robot to produce
many identical arrays of probes.
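
The figure of 1024 probes is simply 4^5, one probe per possible 5 base
extension. As an illustration only, the sketch below enumerates the set.
The stalk used here is the 18 base constant region that appears in the
test sequences of Table 3; it is a placeholder and is not claimed to be
the stalk with Hga I and Not I sites described in this example.

```python
from itertools import product

# Placeholder 18-base constant region (assumption, borrowed from Table 3).
STALK = "GATGATCCGACGCATCAG"

def all_probes(stalk: str = STALK, n_variable: int = 5) -> list[str]:
    """Return every probe made of the stalk plus one variable extension."""
    return [stalk + "".join(ext) for ext in product("ACGT", repeat=n_variable)]

probes = all_probes()
print(len(probes), len(probes[0]))   # probe count and probe length
```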

The initial array contains about a thousand probes. The
particular sequence at any location in the array will not be known.
However, the array can be used for statistical evaluation of the signal to
noise ratio and the sequence discrimination for different target molecules
under different hybridization conditions. Hybridization with known nucleic
acid sequences allows for the identification of particular elements of the
array. A sufficient set of hybridizations would train the array for any
subsequent sequencing task. Arrays are partially characterized until they
have the desired properties. For example, the length of the oligonucleotide
duplex, the mode of its attachment to a surface and the hybridization
conditions used can all be varied using the initial set of cloned DNA probes.
Once the sort of array that works best is determined, a complete and fully
characterized array can be constructed by ordinary chemical synthesis.
Example 4 Preparation of Specific Probe Arrays

With positional SBH, one potential trick to compensate for
some variations in stability among species due to GC content variation is to
provide a GC rich stacking duplex adjacent to AT rich overhangs and an AT
rich stacking duplex adjacent to GC rich overhangs. Moderately dense arrays
can be made using a typical x-y robot to spot the biotinylated compounds
individually onto a streptavidin-coated surface. Using such robots, it is
possible to make arrays of 2 x 10^4 samples in 100 to 400 cm² of nominal
surface. Commercially available streptavidin-coated beads can be adhered
permanently to plastics like polystyrene by exposing the plastic first to a
brief treatment with an organic solvent like triethylamine. The resulting
plastic surfaces have enormously high biotin binding capacity because of the
very high surface area that results.

In certain experiments, the need for attaching oligonucleotides
to surfaces may be circumvented altogether, and oligonucleotides attached
to streptavidin-coated magnetic microbeads used, as already done in pilot
experiments. The beads can be manipulated in microtiter plates. A
magnetic separator suitable for such plates can be used, including the newly
available compressed plates. For example, the 18 by 24 well plates
(Genetix, Ltd.; USA Scientific Plastics) would allow containment of the
entire array in 3 plates. This format is well handled by existing chemical
robots. It is preferable to use the more compressed 36 by 48 well format so
the entire array would fit on a single plate. The advantages of this approach
for all the experiments are that any potential complexities from surface
effects can be avoided and already-existing liquid handling, thermal control
and imaging methods can be used for all the experiments.
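
The plate counts quoted above can be checked with a one-line calculation.
The sketch assumes the array in question is the 1024-probe set of
Example 3; the function name is hypothetical.

```python
import math

def plates_needed(n_probes: int, rows: int, cols: int) -> int:
    """Smallest number of plates with rows x cols wells that holds n_probes."""
    return math.ceil(n_probes / (rows * cols))

print(plates_needed(1024, 18, 24))   # 18 by 24 well format
print(plates_needed(1024, 36, 48))   # 36 by 48 well format
```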

Lastly, a rapid and highly efficient method to print arrays has
been developed. Master arrays are made which direct the preparation of
replicas or appropriate complementary arrays. A master array is made
manually (or by a very accurate robot) by sampling a set of custom DNA
sequences in the desired pattern and then transferring these sequences to the
replica. The master array is just a set of all 1024-4096 compounds printed
by multiple headed pipettes and compressed by offsetting. A potentially
more elegant approach is shown in Figure 8. A master array is made and
used to transfer components of the replicas in a sequence-specific way. The
sequences to be transferred are designed to contain the desired 5 or 6 base
5' variable overhang adjacent to a unique 15 base DNA sequence.

The master array consists of a set of streptavidin
bead-impregnated plastic coated metal pins. Immobilized biotinylated DNA
strands that consist of the variable 5 or 6 base segment plus the constant 15
base segment are at each tip. Any unoccupied sites on this surface are filled
with excess free biotin. To produce a replica chip, the master array is
incubated with the complement of the 15 base constant sequence, 5'-labeled
with biotin. Next, DNA polymerase is used to synthesize the complement
of the 5 or 6 base variable sequence. Then the wet pin array is touched to
the streptavidin-coated surface of the replica and held at a temperature above
the Tm of the complexes on the master array. If there is insufficient liquid
carryover from the pin array for efficient sample transfer, the replica array
could first be coated with spaced droplets of solvent, either held in concave
cavities or delivered by a multi-head pipettor. After the transfer, the replica
chip is incubated with the complement of the 15 base constant sequence to
reform the double-stranded portions of the array. The basic advantage of
this scheme is that the master array and transfer compounds are made only
once and the manufacture of replica arrays can proceed almost endlessly.
Example 5 Attachment of Nucleic Acid Probes to Solid Supports

Nucleic acids may be attached to silicon wafers or to beads.
A silicon solid support was derivatized to provide iodoacetyl functionalities
on its surface. Derivatized solid supports were bound to disulfide containing
oligodeoxynucleotides. Alternatively, the solid support may be coated with
streptavidin or avidin and bound to biotinylated DNA.

Covalent attachment of oligonucleotide to derivatized chips:
Silicon wafers are chips with an approximate weight of 50 mg. To maintain
uniform reaction conditions, it was necessary to determine the exact weight
of each chip and select chips of similar weights for each experiment. The
reaction scheme for this procedure is shown in Figure 9.

To derivatize the chip to contain the iodoacetyl functionality,
an anhydrous solution of 25% (by volume) 3-aminopropyltriethoxysilane
in toluene was prepared under argon and aliquotted (700 µl) into tubes. A
50 mg chip requires approximately 700 µl of silane solution. Each chip was
flamed to remove any surface contaminants during its manufacture and
dropped into the silane solution. The tube containing the chip was placed
under an argon environment and shaken for approximately three hours.
After this time, the silane solution was removed and the chips were washed
three times with toluene and three times with dimethyl sulfoxide (DMSO).
A 10 mM solution of N-succinimidyl(4-iodoacetyl)aminobenzoate (SIAB)
(Pierce Chemical Co.; Rockford, IL) was prepared in anhydrous DMSO and
added to the tube containing a chip. Tubes were shaken under an argon
environment for 20 minutes. The SIAB solution was removed and after
three washes with DMSO, the chip was ready for attachment to
oligonucleotides.

Some oligonucleotides were labeled so the efficiency of
attachment could be monitored. Both 5' disulfide containing
oligodeoxynucleotides and unmodified oligodeoxynucleotides were
radiolabeled using terminal deoxynucleotidyl transferase enzyme and
standard techniques. In a typical reaction, 0.5 mM of disulfide-containing
oligodeoxynucleotide mix was added to a trace amount of the same species
that had been radiolabeled as described above. This mixture was incubated
with dithiothreitol (DTT) (6.2 µmol, 100 mM) and
ethylenediaminetetraacetic acid (EDTA) pH 8.0 (3 µmol, 50 mM). EDTA
served to chelate any cobalt that remained from the radiolabeling reaction
that would complicate the cleavage reaction. The reaction was allowed to
proceed for 5 hours at 37°C. With the cleavage reaction essentially
complete, the free thiol-containing oligodeoxynucleotide was isolated using
a Chromaspin-10 column.

Similarly, Tris-(2-carboxyethyl)phosphine (TCEP) (Pierce
Chemical Co.; Rockford, IL) has been used to cleave the disulfide.
Conditions utilize TCEP at a concentration of approximately 100 mM in pH
4.5 buffer. It is not necessary to isolate the product following the reaction
since TCEP does not competitively react with the iodoacetyl functionality.

To each chip which had been derivatized to contain the
iodoacetyl functionality was added a 10 µM solution of the
oligodeoxynucleotide at pH 8. The reaction was allowed to proceed
overnight at room temperature. In this manner, two different
oligodeoxynucleotides have been examined for their ability to bind to the
iodoacetyl silicon wafer. The first was the free thiol containing
oligodeoxynucleotide already described. In parallel with the free thiol
containing oligodeoxynucleotide reaction, a negative control reaction has
been performed that employs a 5' unmodified oligodeoxynucleotide. This
species has similarly been 3' radiolabeled, but due to the unmodified 5'
terminus, the non-covalent, non-specific interactions may be determined.
Following the reaction, the radiolabeled oligodeoxynucleotides were
removed and the chips were washed 3 times with water and quantitation
proceeded.

To determine the efficiency of attachment, chips of the wafer
were exposed to a phosphorimager screen (Molecular Dynamics). This
exposure usually proceeded overnight, but occasionally for longer periods
of time depending on the amount of radioactivity incorporated. For each
different oligodeoxynucleotide utilized, reference spots were made on
polystyrene in which the molar amount of oligodeoxynucleotide was known.
These reference spots were also exposed to the phosphorimager screen.
Upon scanning the screen, the quantity (in moles) of oligodeoxynucleotide
bound to each chip was determined by comparing the counts to the specific
activities of the references. Using the weight of each chip, it is possible to
calculate the area of the chip:

(g of chip) (1130 mm²/g) = x mm²

By incorporating this value, the amount of oligodeoxynucleotide bound to
each chip may be reported in fmol/mm². It is necessary to divide this value
by two since a radioactive signal of ³²P is strong enough to be read through
the silicon wafer. Thus the instrument is essentially recording the
radioactivity from both sides of the chip.
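
The area and surface-density arithmetic described above can be written
out as follows. This is a sketch under stated assumptions: the conversion
factor is read as 1130 mm²/g (the printed constant is difficult to make
out), the bound amount in the usage line is an invented illustration
value, and the function names are hypothetical.

```python
AREA_PER_GRAM_MM2 = 1130.0   # assumed reading of the printed conversion factor

def chip_area_mm2(weight_g: float) -> float:
    """Area of a chip estimated from its weight."""
    return weight_g * AREA_PER_GRAM_MM2

def surface_density_fmol_per_mm2(bound_fmol: float, weight_g: float,
                                 two_sided_signal: bool = True) -> float:
    """Bound oligodeoxynucleotide per unit area.

    The value is halved when the 32P signal is read from both faces
    of the wafer, as the text describes.
    """
    density = bound_fmol / chip_area_mm2(weight_g)
    return density / 2 if two_sided_signal else density

# Illustration only: a 50 mg chip with a hypothetical 565 fmol bound.
print(chip_area_mm2(0.050))
print(surface_density_fmol_per_mm2(565.0, 0.050))
```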

Following the initial quantitation, each chip was washed in 5
x SSC buffer (75 mM sodium citrate, 750 mM sodium chloride, pH 7) with
50% formamide at 65°C for 5 hours. Each chip was washed three times
with warm water, the 5 x SSC wash was repeated, and the chips were
requantitated. Disulfide linked oligonucleotides were removed from the
chip by incubation with 100 mM DTT at 37°C for 5 hours.
Example 6 Attachment of Nucleic Acids to Streptavidin Coated Solid Support

Immobilized single-stranded DNA targets for solid-phase
DNA sequencing were prepared by PCR amplification. PCR was performed
on a Perkin Elmer Cetus DNA Thermal Cycler using Vent (exo-) DNA
polymerase (New England Biolabs; Beverly, MA), and dNTP solutions
(Promega; Madison, WI). EcoR I digested plasmid NB34 (a PCR™ II
plasmid with a one kb target anonymous human DNA insert) was used as
the DNA template for amplification. PCR was performed with an
18-nucleotide upstream primer and a downstream 5'-end biotinylated
18-nucleotide primer. PCR amplification was carried out in a 100 µl or
400 µl volume containing 10 mM KCl, 20 mM Tris-HCl (pH 8.8 at 25°C),
10 mM (NH4)2SO4, 2 mM MgSO4, 0.1% Triton X-100, 250 µM dNTPs, 2.5 µM
biotinylated primer, 5 µM non-biotinylated primer, less than 100 ng of
plasmid DNA, and 6 units of Vent (exo-) DNA polymerase per 100 µl of
reaction volume. Thirty temperature cycles were performed which included
a heat denaturation step at 94°C for 1 minute, followed by annealing of
primers to the template DNA for 1 minute at 60°C, and DNA chain
extension with Vent (exo-) polymerase for 1 minute at 72°C. For
amplification with the tagged primer, 45°C was selected for primer
annealing. The PCR product was purified through an Ultrafree-MC 30,000
NMWL filter unit (Millipore; Bedford, MA) or by electrophoresis and
extraction from a low melting agarose gel. About 10 pmol of purified PCR
fragment was mixed with 1 mg of prewashed magnetic beads coated with
streptavidin (Dynabeads M280, Dynal, Norway) in 100 µl of 1 M NaCl and
TE, incubating at 37°C or 45°C for 30 minutes.
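
The per-reaction quantities follow from simple scaling. The sketch below
is illustrative only; it uses the 6 units per 100 µl polymerase loading,
the 2.5 µM biotinylated primer concentration and the three one-minute
steps stated above, and its helper names are hypothetical.

```python
def polymerase_units(reaction_ul: float, units_per_100ul: float = 6.0) -> float:
    """Polymerase units needed for a given reaction volume."""
    return units_per_100ul * reaction_ul / 100.0

def pmol_in_reaction(conc_uM: float, reaction_ul: float) -> float:
    """Amount of a component in pmol (1 uM x 1 ul = 1 pmol)."""
    return conc_uM * reaction_ul

CYCLES = 30
STEP_MINUTES = (1, 1, 1)      # denaturation, annealing, extension
hold_minutes = CYCLES * sum(STEP_MINUTES)   # excludes ramping between steps

print(polymerase_units(400))          # units for the 400 ul format
print(pmol_in_reaction(2.5, 100))     # biotinylated primer in 100 ul
print(hold_minutes)
```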

The magnetic beads were used directly for double stranded
sequencing. For single stranded sequencing, the immobilized biotinylated
double-stranded DNA fragment was converted to single-stranded form by
treating with freshly prepared 0.1 M NaOH at room temperature for 5
minutes. The magnetic beads, with immobilized single-stranded DNA, were
washed with 0.1 M NaOH and TE before use.
Example 7 Hybridization Specificity

Hybridization was performed using probes with five and six 
base pair overhangs, including a five base pair match, a five base pair 
mismatch, a six base pair match, and a six base pair mismatch. These 
sequences are depicted in Table 3. 






Table 3
Hybridized Test Sequences

5 bp overlap, perfect match:

    3'-TCGAGAACCTTGGCT*-5'                     (SEQ ID NO 1)
    3'-CTACTAGGCTGCGTAGTC-5'                   (SEQ ID NO 2)
    5'-biotin-GATGATCCGACGCATCAGAGCTC-3'       (SEQ ID NO 3)

5 bp overlap, mismatch at 3' end:

    3'-TCGAGAACCTTGGCT*-5'                     (SEQ ID NO 1)
    3'-CTACTAGGCTGCGTAGTC-5'                   (SEQ ID NO 2)
    5'-biotin-GATGATCCGACGCATCAGAGCTT-3'       (SEQ ID NO 4)

6 bp overlap, perfect match:

    3'-TCGAGAACCTTGGCT*-5'                     (SEQ ID NO 1)
    3'-CTACTAGGCTGCGTAGTC-5'                   (SEQ ID NO 2)
    5'-biotin-GATGATCCGACGCATCAGAGCTCT-3'      (SEQ ID NO 5)

6 bp overlap, mismatch four bases from 3' end:

    3'-TCGAGAACCTTGGCT*-5'                     (SEQ ID NO 1)
    3'-CTACTAGGCTGCGTAGTC-5'                   (SEQ ID NO 2)
    5'-biotin-GATGATCCGACGCATCAGAGTTCT-3'      (SEQ ID NO 6)

The biotinylated double-stranded probe was prepared in TE
buffer by annealing the complementary single strands together at 68°C for
five minutes followed by slow cooling to room temperature. A five-fold
excess of monodisperse, polystyrene-coated magnetic beads (Dynal) coated
with streptavidin was added to the double-stranded probe, which was then
incubated with agitation at room temperature for 30 minutes. After ligation,
the samples were subjected to two cold (4°C) washes followed by one hot
(90°C) wash in TE buffer (Figure 10). The ratio of ³²P in the hot
supernatant to the total amount of ³²P was determined (Figure 11). At high
NaCl concentrations, mismatched target sequences were either not annealed
or were removed in the cold washes. Under the same conditions, the
matched target sequences were annealed and ligated to the probe. The final
hot wash removed the non-biotinylated probe oligonucleotide. This
oligonucleotide contained the labeled target if the target had been ligated to
the probe.

Example 8 Compensating for Variations in Base Composition

The dependence of Tm on base composition, and on base
sequence, may be overcome with the use of salts like tetramethyl ammonium
halides or betaines. Alternatively, base analogs like 2,6-diamino purine and
5-bromo U can be used instead of A and T, respectively, to increase the
stability of A-T base pairs, and derivatives like 7-deazaG can be used to
decrease the stability of G-C base pairs. The initial experiments shown in
Table 2 indicate that the use of enzymes will eliminate many of the
complications due to base sequences. This gives the approach a very
significant advantage over non-enzymatic methods which require different
conditions for each nucleic acid and are highly matched to GC content.

Another approach to compensate for differences in stability is
to vary the base next to the stacking site. Experiments were performed to
test the relative effects of all four bases in this position on overall
hybridization discrimination and also on relative ligation discrimination.
Other base analogs such as dU (deoxyuridine) and 7-deazaG may also be
useful to suppress effects of secondary structure.
Example 9 DNA Ligation to Oligonucleotide Arrays

E. coli and T4 DNA ligases can be used to covalently attach
hybridized target nucleic acid to the correct immobilized oligonucleotide
probe. This is a highly accurate and efficient process. Because ligase
absolutely requires a correctly base paired 3' terminus, ligase will read only
the 3'-terminal sequence of the target nucleic acid. After ligation, the
resulting duplex will be 23 base pairs long and it will be possible to remove
unhybridized, unligated target nucleic acid using fairly stringent washing
conditions. Appropriately chosen positive and negative controls
demonstrate the specificity of this method, such as arrays which are lacking
a 5'-terminal phosphate adjacent to the 3' overhang, since these probes will
not ligate to the target nucleic acid.

There are a number of advantages to a ligation step. Physical
specificity is supplanted by enzymatic specificity. Focusing on the 3' end
of the target nucleic acid also minimizes problems arising from stable
secondary structures in the target DNA. DNA ligases are also used to
covalently attach hybridized target DNA to the correct immobilized
oligonucleotide probe. Several tests of the feasibility of the ligation method
are shown in Figure 12. Biotinylated probes were attached at 5' ends
(Figure 12A) or 3' ends (Figure 12B) to streptavidin-coated magnetic
microbeads, and annealed with a shorter, complementary, constant sequence
to produce duplexes with 5 or 6 base single-stranded overhangs. ³²P-end
labeled targets were allowed to hybridize to the probes. Free targets were
removed by capturing the beads with a magnetic separator. DNA ligase was
added and ligation was allowed to proceed at various salt concentrations.
The samples were washed at room temperature, again manipulating the
immobilized compounds with a magnetic separator to remove non-ligated
material. Finally, samples were incubated at a temperature above the Tm of
the duplexes, and eluted single strand was retained after the remainder of
the samples were removed by magnetic separation. The eluate at this point
consisted of the ligated material. The fraction of ligation was estimated as
the amount of ³²P recovered in the high temperature wash versus the amount
recovered in both the high and low temperature washes. Results indicated
that salt conditions can be found where the ligation proceeds efficiently with
perfectly matched 5 or 6 base overhangs, but not with G-T mismatches. The
results of a more extensive set of similar experiments are shown in
Tables 4-6.
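
The ligation fraction described above is a simple ratio of recovered
label. A minimal sketch, with a hypothetical function name and invented
counts used purely for illustration:

```python
def ligation_fraction(hot_counts: float, cold_counts: list[float]) -> float:
    """Label recovered in the high temperature wash divided by the label
    recovered in the high and low temperature washes together."""
    total = hot_counts + sum(cold_counts)
    if total <= 0:
        raise ValueError("no label recovered")
    return hot_counts / total

# Invented example counts (not data from the specification).
print(ligation_fraction(1700.0, [8000.0, 300.0]))
```

A well-ligated target contributes mostly to the hot wash, so its fraction
approaches 1, while an unligated target is lost in the low temperature
washes and its fraction approaches 0.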

Table 4 looks at the effect of the position of the mismatch and
Table 5 examines the effect of base composition on the relative
discrimination of perfect matches versus weakly destabilizing mismatches.
These data demonstrate that effective discrimination between perfect
matches and single mismatches occurs with all five base overhangs tested
and that there is little if any effect of base composition on the amount of
ligation seen or the effectiveness of match/mismatch discrimination. Thus,
the serious problems of dealing with base composition effects on stability
seen in ordinary SBH do not appear to be a problem for positional SBH.
Furthermore, as the worst mismatch position was the one distal from the
phosphodiester bond formed in the ligation reaction, any mismatches that
survived in this position would be eliminated by a polymerase extension
reaction. A polymerase such as Sequenase version 2, that has no
3'-endonuclease activity or terminal transferase activity, would be useful in
this regard. Gel electrophoresis analysis confirmed that the putative ligation
products seen in these tests were indeed the actual products synthesized.

Table 4

Ligation Efficiency of Matched and Mismatched Duplexes
in 0.2 M NaCl at 37°C

(SEQ ID NO 1)   3'-TCG AGA ACC TTG GCT-5'
   CTA CTA GGC TGC GTA GTC-5'                    (SEQ ID NO 2)

5'-B-GATGATCCGACGCATCAGAGCTC      0.170      (SEQ ID NO 3)
5'-B-GATGATCCGACGCATCAGAGCTA      0.006      (SEQ ID NO 7)
5'-B-GATGATCCGACGCATCAGAGCCC      0.002      (SEQ ID NO 8)
5'-B-GATGATCCGACGCATCAGAGTTC      0.00[?]    (SEQ ID NO 9)
5'-B-GATGATCCGACGCATCAGAACTC      0.001      (SEQ ID NO 10)



Table 5

Ligation Efficiency of Matched and Mismatched Duplexes in
0.2 M NaCl at 37°C and its Dependence on AT Content of the
Overhang

             Overhang      AT         Ligation
             Sequences     Content    Efficiency

Match        GGCCC         0/5        0.30
Mismatch     GGCCT                    0.03

Match        AGCCC         1/5        0.36
Mismatch     AGCTC                    0.02

Match        AGCTC         2/5        0.17
Mismatch     AGCTT                    0.01

Match        AGATC         3/5        0.24
Mismatch     AGATT                    0.01

Match        ATATC         4/5        0.17
Mismatch     ATATT                    0.01

Match        ATATT         5/5        0.31
Mismatch     ATATC                    0.02






Table 6 

Increasing Discrimination by Sequencing Extension at 37°C

                                   Ligation       Ligation Extension
                                   Efficiency
                                   (percent)      [?]        [?]

(SEQ ID NO 1)   3'-TCG AGA ACC TTG GCT-5'
   CTA CTA GGC TGC GTA GTC-5'   (SEQ ID NO 2)

5'-B-GATGATCCGACGCATCAGAGATC       0.24           4,934      29,500
(SEQ ID NO 11)
5'-B-GATGATCCGACGCATCAGAGCTT       0.01           116        250
(SEQ ID NO 4)

Discrimination =                   x24            x42        x118

(SEQ ID NO 1)   3'-TCG AGA ACC TTG GCT-5'
   CTA CTA GGC TGC GTA GTC-5'   (SEQ ID NO 2)

5'-B-GATGATCCGACGCATCAGATATC       0.17           12,250     25,[?]00
(SEQ ID NO 12)
5'-B-GATGATCCGACGCATCAGATATT       0.01           240        390
(SEQ ID NO 13)

Discrimination =                   x17            x51        x65
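
The discrimination factors in Table 6 appear to be the ratio of the
matched value to the mismatched value in each column. The sketch below
recomputes a few of them from values as read from the table; the function
name is hypothetical, and only entries that are clearly legible are used.

```python
def discrimination(match: float, mismatch: float) -> float:
    """Ratio of the signal for the matched probe to the mismatched probe."""
    return match / mismatch

# (match, mismatch) pairs as read from Table 6
pairs = {
    "first block, ligation efficiency": (0.24, 0.01),
    "first block, last column": (29_500, 250),
    "second block, ligation efficiency": (0.17, 0.01),
    "second block, middle column": (12_250, 240),
}

for label, (match, mismatch) in pairs.items():
    print(label, round(discrimination(match, mismatch)))
```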






The discrimination for the correct sequence is not as great with
an external mismatch (which would be the most difficult case to
discriminate) as with an internal mismatch (Table 6). A mismatch right at
the ligation point would presumably offer the highest possible
discrimination. In any event, the results shown are very promising. Already
there is a level of discrimination with only 5 or 6 bases of overlap that is
better than the discrimination seen in conventional SBH with 8 base
overlaps.

Example 10 Capture and Sequencing of a Target Nucleic Acid

A mixture of target DNA was prepared by mixing equal molar
ratios of eight different oligos. For each sequencing reaction, one specific
partially duplex probe and eight different targets were used. The sequences
of the probes and the targets are shown in Tables 7 and 8.






Table 7
Duplex Probes Used

(DF25)    5'-F-GATGATCCGACGCATCAGCTGT[?]    (SEQ ID NO 14)
          3'-CTACTAGGCTGCGTAGTC             (SEQ ID NO 2)

(DF37)    5'-F-GATGATCCGACGCATCAGTCAAC      (SEQ ID NO 15)
          3'-CTACTAGGCTGCGTAGTC             (SEQ ID NO 2)

(DF22)    5'-F-GATGATCCGACGCATCAGAATGT      (SEQ ID NO 16)
          3'-CTACTAGGCTGCGTAGTC             (SEQ ID NO 2)

(DF28)    5'-F-GATGATCCGACGCATCAGCCTAG      (SEQ ID NO 17)
          3'-CTACTAGGCTGCGTAGTC             (SEQ ID NO 2)

(DF36)    5'-F-GATGATCCGACGCATCAGTCGAC      (SEQ ID NO 18)
          3'-CTACTAGGCTGCGTAGTC             (SEQ ID NO 2)

(DF11a)   5'-F-GATGATCCGACGCATCAGAGCTC      (SEQ ID NO 19)
          3'-CTACTAGGCTGCGTAGTC             (SEQ ID NO 2)

(DF8a)    5'-F-GATGATCCGACGCATCAGGGCCC      (SEQ ID NO 20)
          3'-CTACTAGGCTGCGTAGTC             (SEQ ID NO 2)



Match
(NB4)
(NB4.5)
(DF5)
(TS10)
(NB3.10)
(NB3.4)
(NB3.7)
(NB3.9)



Table 8 
Mixture of Targets 



3'-X1M;ACCCX;aTCGAGCCGGGTCGATCTAG (DF22) 

-s. (SEQIDN02n 
3^ailAICGACCGGGTCGATCTAG (DF2S} (SEQ IDNO 

3'-£.QCieCCGGATCGAGCCGGGTCGATCTAC (DF36) 

(SEQ ID NO 23) 

3 ^^ILG^AACCTTGGCT (Dfl la) (SEQ ID NO 2^> 

3 *CCQ£KiTCaATCTAG (DF8a) (SEQ ID NO 25) 



3^-C£aGATCAAGCCGGGTCGA'rCTAG (DF8a) (SEQ ID NO 26) 
3'-ICA6GCCOGGTCGATCTAG (DFl la) (SEQ ID NO -"7) 

3 ^-AgCCGGGTCG ATCT AG (DF36) (SBQ]DN0 2g1 



Two pmol of each of the two duplex-probe-forming
oligonucleotides and 1.5 pmol of each of the eight different targets were
mixed in a 10 µl volume containing 2 µl of Sequenase buffer stock (200 mM
Tris-HCl, pH 7.5, 100 mM MgCl2, and 250 mM NaCl) from the Sequenase
kit. The annealing mixture was heated to 65°C and allowed to cool slowly
to room temperature. While the reaction mixture was kept on ice, 1 µl 0.1
M dithiothreitol solution, 1 µl Mn buffer (0.15 M sodium isocitrate and 0.1
M MnCl2), and 2 µl of diluted Sequenase (1.5 units) were mixed, and 2
µl of reaction mixture was added to each of the four termination mixes at
room temperature (each consisting of 3 µl of the appropriate termination
mix: 16 µM dATP, 16 µM dCTP, 16 µM dGTP, 16 µM dTTP and 3.2 µM
of one of the four ddNTPs, in 50 mM NaCl). The reaction mixtures were
further incubated at room temperature for 5 minutes, and terminated with the
addition of 4 µl of Pharmacia stop mix (deionized formamide containing
dextran blue 6 mg/ml). Samples were denatured at 90-95°C for 3 minutes
and stored on ice prior to loading. Sequencing samples were analyzed on
an ALF DNA sequencer (Pharmacia Biotech; Piscataway, NJ) using a 10%
polyacrylamide gel containing 7 M urea and 0.6 x TBE. Sequencing results
from the gel reader are shown in Figure 13 and summarized in Table 9.
Matched targets hybridized correctly and are sequenced, whereas
mismatched targets do not hybridize and are not sequenced.
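
The working concentrations in the annealing mixture follow from the
dilution of the buffer stock. A minimal sketch using the stated stock
composition and the 2 in 10 dilution, with a hypothetical helper name:

```python
def final_conc(stock_conc: float, stock_vol: float, final_vol: float) -> float:
    """C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock_conc * stock_vol / final_vol

# Sequenase buffer stock components in mM, as stated in the text
STOCK_MM = {"Tris-HCl": 200.0, "MgCl2": 100.0, "NaCl": 250.0}

for component, conc in STOCK_MM.items():
    # 2 volumes of stock in a final 10 volumes (same unit for both)
    print(component, final_conc(conc, 2.0, 10.0), "mM")
```

The same helper applies to any of the other dilutions in this example,
provided both volumes are expressed in the same unit.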






Table 9

Summary of Hybridization Data

Reaction   Hybridization                        Sequence   Comment

 1         Probe: DF25    Target: mixture       No         mismatch
 2         Probe: DF37    Target: mixture       No         mismatch
 3         Probe: DF22    Target: mixture       Yes        match
 4         Probe: DF28    Target: mixture       Yes        match
 5         Probe: DF36    Target: mixture       Yes        match
 6         Probe: DF11a   Target: mixture       Yes        match
 7         Probe: DF8a    Target: mixture       Yes        match
 8         Probe: DF8a    Target: NB3.4         No         mismatch
 9         Probe: DF8a    Target: TS12          No         mismatch
10         Probe: DF37    Target: DF5           No         mismatch



Example 11 Elongation of Nucleic Acids Bound to Solid Supports

Elongation was carried out either by using Sequenase version 
2.0 kit or an AutoRead sequencmg kit (Phaimacia Biotech; Piscataway, NJ) 
employing T7 DNA polymerase. EJongation of the immobilized single- 
stranded DNA target was pcrfomned witJi reagents from the sequencing kits 
for Sequenase Version 2,0 or T7 DNA polymerase. A duplex DNA probe 
containing a $-base 3' overhang was used as a primer. The duplex has a 5 ^ 
fluorescein labeJed 23-mer, containing an 18-base 5' constant region and a 
5-base 3' variable region {which has the same sequence as the 5'-eiid of the 
corresponding nonbiotinylated primer for PCR amplification of target DNA, 
and an 18-mer complementary to the constant region of the 23^mer. The 
duplex was formed by annealing 20 pmol of each of the two 
oligonucleotides in a 10 m1 volume containing 2 ^ I of Sequenase buffer 
stock (200 mM Tris^HCi, pH 7.5, 100 mM MgCU. and 250 mM NaCI) &om 
the Sequenase kit or in a 13 ^1 volume containing 2 ^1 of the annealing 
buffer <l M Tris-HCL pH 7.6, 100 mM MgCK) from ihe AutoRead 




sequencing kit. The annealing mixture was heated to 65 °C and allowed to
cool slowly to 37 °C over a 20-30 minute time period. The duplex primer
was annealed with the immobilized single-stranded DNA target by adding
the annealing mixture to the DNA-containing magnetic beads, and the
resulting mixture was further incubated at 37 °C for 5 minutes, room
temperature for 10 minutes, and finally 0 °C for at least 5 minutes. For
Sequenase reactions, 1 µl of 0.1 M dithiothreitol solution, 1 µl of Mn buffer
(0.15 M sodium isocitrate and 0.1 M MnCl2) for the relatively short target,
and 2 µl of diluted Sequenase (1.5 units) were added, and the reaction mixture
was divided into four ice-cold termination mixes (each consisting of 3 µl of
the appropriate termination mix: 80 µM dATP, 80 µM dCTP, 80 µM dGTP, 80 µM
dTTP and 8 µM of one of the four ddNTPs, in 50 mM NaCl). For T7
DNA polymerase reactions, 1 µl of extension buffer (40 mM MnCl2, pH 7.5,
304 mM citric acid and 324 mM DTT) and 1 µl of T7 DNA polymerase (8
units) were mixed, and the reaction volume was split into four ice-cold
termination mixes (each consisting of 1 µl DMSO and 3 µl of the
appropriate termination mix: 1 mM dATP, 1 mM dCTP, 1 mM dGTP, 1
mM dTTP and 5 µM of one of the four ddNTPs, in 50 mM NaCl and 40 mM
Tris-HCl, pH 7.4). The reaction mixtures for both enzymes were further
incubated at 0 °C for 5 minutes, room temperature for 5 minutes and 37 °C
for 5 minutes. After the completion of extension, the supernatant was
removed and the magnetic beads were re-suspended in 10 µl of Pharmacia
stop mix. Samples were denatured at 90-95 °C for 5 minutes (under this
harsh condition, both DNA template and the dideoxy fragments are released
from the beads) and stored on ice prior to loading. A control experiment
was performed in parallel using an 18-mer complementary to the 3' end of




target DNA as the sequencing primer instead of the duplex probe, and the
annealing of the 18-mer to its target was carried out in a similar way as the
annealing of the duplex probe.
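
The two enzyme protocols in this example differ chiefly in how much dideoxynucleotide competes with the deoxynucleotides in each termination mix. The following is a minimal sketch of that arithmetic, assuming the concentrations read from this example (80 µM dNTP with 8 µM ddNTP for Sequenase; 1 mM dNTP with 5 µM ddNTP for T7 DNA polymerase) and the 20 pmol annealing reactions. The function and dictionary names are illustrative only and do not come from the specification.

```python
# Concentrations as read from Example 11, in micromolar (uM).
# The dictionary layout and all names are illustrative assumptions.
TERMINATION_MIXES = {
    "Sequenase": {"dNTP_uM": 80.0, "ddNTP_uM": 8.0},
    "T7": {"dNTP_uM": 1000.0, "ddNTP_uM": 5.0},
}


def dntp_to_ddntp_ratio(mix):
    """Molar ratio of each dNTP to the single ddNTP in a termination mix."""
    return mix["dNTP_uM"] / mix["ddNTP_uM"]


def duplex_concentration_uM(pmol_each_strand, volume_ul):
    """Upper bound on duplex concentration; pmol per microliter equals uM."""
    return pmol_each_strand / volume_ul


if __name__ == "__main__":
    for name, mix in TERMINATION_MIXES.items():
        print(name, "dNTP:ddNTP =", dntp_to_ddntp_ratio(mix))
    print("duplex, Sequenase buffer (uM):", duplex_concentration_uM(20, 10))
    print("duplex, AutoRead buffer (uM):", duplex_concentration_uM(20, 13))
```

A higher dNTP:ddNTP ratio makes termination less frequent at each position, which generally favours longer extension products.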

Example 12 Chain Elongation of Target Sequences. 
Sequencing of immobilized target DNA can be performed
with Sequenase Version 2.0. A total of 5 elongation reactions, one with
each of 4 dideoxy nucleotides and one with all four simultaneously, are
performed. A sequencing solution, containing 40 mM Tris-HCl, pH 7.5,
20 mM MgCl2, 50 mM NaCl, 10 mM dithiothreitol, 15 mM
sodium isocitrate, 10 mM MnCl2, and 100 u/ml of Sequenase (1.5 units),
is added to the hybridized target DNA. dATP, dCTP, dGTP and dTTP are
added to 20 µM to initiate the elongation reaction. In the separate reactions,
one of the four ddNTPs is added to reach a concentration of 8 µM. In the
combined reaction, all four ddNTPs are added to the reaction to 8 µM each.

The reaction mixtures were incubated at 0 °C for 5 minutes, room
temperature for 5 minutes and 37 °C for 5 minutes. After the completion of
extension, the supernatant was removed and the elongated DNA washed
with 2 mM EDTA to terminate elongation reactions. Reaction products are
analyzed by mass spectrometry.




Example 13 Capillary Electrophoretic Analysis of Target Sequences.

Molecular weights of target sequences may also be determined
by capillary electrophoresis. A single laser capillary electrophoresis
instrument can be used to monitor the performance of sample preparations
in high performance capillary electrophoresis sequencing. This instrument
is designed so that it is easily converted to multiple channel (wavelength)
detection.

An individual element of the sample array may be engineered
directly to serve as the sample input to a capillary. Typical capillaries are
250 microns o.d. and 75 microns i.d. The sample is heated or denatured to
release the DNA ladder into a liquid droplet; the silicon array surface is
ideal for this purpose. The capillary can be brought into contact with the
droplet to load the sample.

To facilitate loading of large numbers of samples
simultaneously or sequentially, there are two basic methods. With 250
micron o.d. capillaries it is feasible to match the dimensions of the target
array and the capillary array. Then the two could be brought into contact
manually or even by a robot arm using a jig to assure accurate alignment.
An electrode may be engineered directly into each sector of the silicon
surface so that sample loading would only require contact between the
surface and the capillary array.

The second method is based on an inexpensive collection
system to capture fractions eluted from high performance capillary
electrophoresis. Dilution is avoided by using designs which allow sample
collection without a perpendicular sheath flow. The same apparatus
designed as a sample collector can also serve inversely as a sample loader.




In this case, each row of the sample array, equipped with electrodes, is used
directly to load samples automatically on a row of capillaries. Using either
method, sequence information is determined and the target sequence
constructed.

Example 14 Mass Spectrometry of Nucleic Acids.

Nucleic acids to be analyzed by mass spectrometry were
redissolved in ultrapure water (MilliQ, Millipore) using amounts to obtain
a concentration of 10 pmoles/µl as stock solution. An aliquot (1 µl) of this
concentration or a dilution in ultrapure water was mixed with 1 µl of the
matrix solution on a flat metal surface serving as the probe tip and dried
with a fan using cold air. In some experiments, cation-ion exchange beads
in the acid form were added to the mixture of matrix and sample solution to
stabilize ions formed during analysis.

MALDI-TOF spectra were obtained on different commercial
instruments such as Vision 2000 (Finnigan-MAT), VG TofSpec (Fisons
Instruments), LaserTec Research (Vestec). The conditions were linear
negative ion mode with an acceleration voltage of 25 kV. Mass calibration
was done externally and generally achieved by using defined peptides of
appropriate mass range such as insulin, gramicidin S, trypsinogen, bovine
serum albumin and cytochrome C. All spectra were generated by
employing a nitrogen laser with 5 nanosecond pulses at a wavelength of 337
nm. Laser energy varied between 10^ and 10^ W/cm^2. To improve signal-
to-noise ratio generally, the intensities of 10 to 30 laser shots were
accumulated. The output of a typical mass spectrometry analysis showing
discrimination between nucleic acids which differ by one base is shown in
Figure 14.
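
To illustrate why a one-base difference is resolvable in a linear time-of-flight instrument, the sketch below estimates idealized flight times for two singly charged ions accelerated through 25 kV. The 1.0 m drift length is an assumed value that does not appear in this example, and the model ignores initial ion velocity, the acceleration region and detector response; it is an illustration of the square-root mass dependence, not a description of any of the instruments named above.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
DALTON_KG = 1.66053906660e-27           # kilograms per dalton


def flight_time_us(mass_da, accel_voltage_v=25_000.0,
                   drift_length_m=1.0, charge=1):
    """Idealized linear TOF flight time in microseconds.

    drift_length_m is an assumption for illustration only.
    """
    mass_kg = mass_da * DALTON_KG
    kinetic_energy_j = charge * ELEMENTARY_CHARGE_C * accel_voltage_v
    velocity_m_per_s = math.sqrt(2.0 * kinetic_energy_j / mass_kg)
    return drift_length_m / velocity_m_per_s * 1e6


if __name__ == "__main__":
    # Calculated masses of two fragments differing by one nucleotide.
    t_short = flight_time_us(5771.82)
    t_long = flight_time_us(6076.02)
    print(f"{t_short:.2f} us, {t_long:.2f} us, gap {t_long - t_short:.2f} us")
```

Under these assumptions the two ions arrive a little under one microsecond apart, and the ratio of their flight times equals the square root of the ratio of their masses.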





Example 15 Sequence Determination from Mass Spectrometry.

Elongation of a target nucleic acid, in the presence of dideoxy
chain terminating nucleotides, generated four families of chain-terminated
fragments. The mass difference per nucleotide addition is 289.19 for dpC,
313.21 for dpA, 329.21 for dpG and 304.20 for dpT, respectively.
By comparing the mass differences measured between fragments with the
known masses of each nucleotide, the nucleic acid sequence can be
determined. Nucleic acid may also be sequenced by performing polymerase
chain elongation in four separate reactions each with one dideoxy chain
terminating nucleotide. To examine mass differences, 13 oligonucleotides
from 7 to 50 bases in length were analyzed by MALDI-TOF mass
spectrometry. The correlation of calculated molecular weights of the ddT
fragments of a Sanger sequencing reaction and their experimentally verified
weights are shown in Table 10. When the mass spectrometry data from all
four chain termination reactions are combined, the molecular weight
difference between two adjacent peaks can be used to determine the
sequence.
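
The base-calling step described in this example can be sketched as follows: sort the peak masses of the combined ladder, take the difference between each pair of adjacent peaks, and assign the nucleotide whose per-addition mass is closest. The per-nucleotide masses are those quoted above; the function name, the 2.0 dalton tolerance and the use of "N" for an unassignable step are assumptions made for illustration.

```python
# Mass added per nucleotide (daltons), as quoted in Example 15.
STEP_MASS = {"C": 289.19, "A": 313.21, "G": 329.21, "T": 304.20}


def call_bases(peak_masses, tolerance=2.0):
    """Read bases from a ladder of fragment masses.

    Bases are returned in order of increasing fragment mass, i.e. in the
    order in which they were added to the primer. A step that matches no
    nucleotide within `tolerance` daltons is reported as "N".
    """
    ordered = sorted(peak_masses)
    bases = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        base, expected = min(STEP_MASS.items(),
                             key=lambda item: abs(item[1] - delta))
        bases.append(base if abs(expected - delta) <= tolerance else "N")
    return "".join(bases)


if __name__ == "__main__":
    # Hypothetical ladder: a 2104.45 Da fragment extended by G, A, T, C.
    ladder = [2104.45]
    for base in "GATC":
        ladder.append(ladder[-1] + STEP_MASS[base])
    print(call_bases(ladder))
```

The closest pair of nucleotide masses (dpT and dpA) differs by about 9 daltons, so the chosen tolerance must stay well below half of that for calls to remain unambiguous.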






Table 10

Summary of Molecular Weights Expected v. Measured

Fragment (n-mer)    Calculated    Measured    Difference
7-mer                  2104.45      2119.9        +15.4
                       3011.04      3026.1        +15.1
11-mer                 3315.24      3330.1        +14.9
19-mer                 5771.82      5788.0        +16.2
20-mer                 6076.02      6093.8        +17.8
24-mer                 7311.82      7374.9        +63.1
26-mer                 7945.22      7960.9        +15.7
                      10112.63     10125.3        +12.7
37-mer                11348.43     11361.4        +13.0
38-mer                11652.62     11670.2        +17.6
42-mer                12872.42     12888.3        +15.9
46-mer                14108.22     14125.0        +16.8
50-mer                15344.02     15362.6        +18.6
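
The Difference column can be recomputed directly from the other two columns. The sketch below does so using the values as read from Table 10, reports the median offset between measured and calculated masses, and flags any row whose offset departs from that median by more than a chosen window. The 10 dalton window and all names are assumptions for illustration; the specification does not describe such a check.

```python
from statistics import median

# (calculated, measured) molecular weights in daltons, as read from Table 10.
TABLE_10 = [
    (2104.45, 2119.9), (3011.04, 3026.1), (3315.24, 3330.1),
    (5771.82, 5788.0), (6076.02, 6093.8), (7311.82, 7374.9),
    (7945.22, 7960.9), (10112.63, 10125.3), (11348.43, 11361.4),
    (11652.62, 11670.2), (12872.42, 12888.3), (14108.22, 14125.0),
    (15344.02, 15362.6),
]


def offsets(rows):
    """Measured minus calculated mass for every row."""
    return [measured - calculated for calculated, measured in rows]


def flag_outliers(rows, window=10.0):
    """Rows whose offset differs from the median offset by more than window."""
    typical = median(offsets(rows))
    return [row for row in rows
            if abs((row[1] - row[0]) - typical) > window]


if __name__ == "__main__":
    print("median offset:", round(median(offsets(TABLE_10)), 2))
    print("flagged rows:", flag_outliers(TABLE_10))
```

With the tabulated values, twelve of the thirteen offsets fall between roughly +12.7 and +18.6 daltons, and only the 24-mer row stands apart.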



Example 16 Reduced Pass Sequencing.

To maximize the use of PSBH arrays to produce Sanger
ladders, the sequence of a target should be covered as completely as possible
with the lowest amount of initial sequencing redundancy. This will
maximize the performance of individual elements of the arrays and
maximize the amount of useful sequence data obtained each time an array
is used. With an unknown DNA, a full array of 1024 elements (Mwo I or
BsiY I cleavage) or 256 elements (TspR I cleavage) is used. A 50 kb target
DNA is cut into about 64 fragments by Mwo I or BsiY I or 30 fragments by
TspR I, respectively. Each fragment has two ends, both of which can be
captured independently. The coverage of each array after capture and
ignoring degeneracies is 128/1024 sites in the first case and 60/256 sites in
the second case. Direct use of such an array to blindly deliver samples




element by element for mass spectrometry sequencing would be inefficient
since most array elements will have no samples.
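
The coverage figures quoted above follow from simple counting: each fragment contributes two capturable ends, and every end is assumed to land on a distinct array element (degeneracies ignored, as stated). A minimal sketch, with an illustrative function name:

```python
def array_coverage(fragments, array_elements, ends_per_fragment=2):
    """Return (occupied elements, occupied fraction), ignoring degeneracies."""
    occupied = fragments * ends_per_fragment
    return occupied, occupied / array_elements


if __name__ == "__main__":
    # 50 kb target cut by Mwo I or BsiY I, captured on a 1024-element array.
    print(array_coverage(64, 1024))
    # 50 kb target cut by TspR I, captured on a 256-element array.
    print(array_coverage(30, 256))
```

In both cases fewer than a quarter of the elements carry a sample, which is the inefficiency the following paragraphs address.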

In one method, phosphatased double-stranded targets are used
at high concentrations to saturate each array element that detects a sample.
The target is ligated to make the capture irreversible. Next a different
sample mixture is exposed to the array and subsequently ligated in place.
This process is repeated four or five times until most of the elements of the
array contain a unique sample. Any tandem target-target complexes will be
removed by a subsequent ligating step because all of the targets are
phosphatased.

Alternatively, the array may be monitored by confocal
microscopy after the elongation reactions. This reveals which elements
contain elongated nucleic acids and this information is communicated to an
automated robotic system that is ultimately used to load the samples onto a
mass spectrometry analyzer.

Example 17 Synthesis of Mass Modified Nucleic Acid Primers.

Mass modification at the 5' sugar: Oligonucleotides were
synthesized by standard automated DNA synthesis using
cyanoethylphosphoamidites and a 5'-amino group introduced at the end of
solid phase DNA synthesis. The total amount of an oligonucleotide
synthesis, starting with 0.25 micromoles CPG-bound nucleoside, is
deprotected with concentrated aqueous ammonia, purified via OligoPAK™
Cartridges (Millipore; Bedford, MA) and lyophilized. This material with a
5'-terminal amino group is dissolved in 100 µl absolute N,N-
dimethylformamide (DMF) and condensed with 10 µmole N-Fmoc-glycine
pentafluorophenyl ester for 60 minutes at 25 °C. After ethanol precipitation




and centrifugation, the Fmoc group is cleaved off by a 10 minute treatment
with 100 µl of a solution of 20% piperidine in N,N-dimethylformamide.
Excess piperidine, DMF and the cleavage product from the Fmoc group are
removed by ethanol precipitation and the precipitate lyophilized from 10
mM TEAA buffer pH 7.2. This material is now either used as primer for the
Sanger DNA sequencing reactions or one or more glycine residues (or other
suitable protected amino acid active esters) are added to create a series of
mass-modified primer oligonucleotides suitable for Sanger DNA or RNA
sequencing.

Mass modification at the heterocyclic base with glycine:
Starting material was 5-(3-aminopropynyl-1)-3',5'-di-p-tolyldeoxyuridine
prepared and 3',5'-de-O-acylated (Haralambidis et al., Nuc. Acids Res.
15:4857-76, 1987). 0.281 g (1.0 mmol) 5-(3-aminopropynyl-1)-2'-
deoxyuridine were reacted with 0.927 g (2.0 mmol) N-Fmoc-glycine
pentafluorophenylester in 5 ml absolute N,N-dimethylformamide in the
presence of 0.129 g (1 mmol; 174 µl) N,N-diisopropylethylamine for 60
minutes at room temperature. Solvents were removed by rotary evaporation
and the product was purified by silica gel chromatography (Kieselgel 60,
Merck; column: 2.5 x 50 cm, elution with chloroform/methanol mixtures).
Yield was 0.44 g (0.78 mmol; 78%). To add another glycine residue, the
Fmoc group is removed with a 20 minute treatment with a 20% solution of
piperidine in DMF, evaporated in vacuo and the remaining solid material
extracted three times with 20 ml ethylacetate. After having removed the
remaining ethylacetate, N-Fmoc-glycine pentafluorophenylester is coupled
as described above. 5-(3-(N-Fmoc-glycyl)-amidopropynyl-1)-2'-deoxyuridine
is transformed into the 5'-O-dimethoxytritylated nucleoside-3'-O-β-
cyanoethyl-N,N-diisopropylphosphoamidite and incorporated into
automated oligonucleotide synthesis. This glycine modified thymidine
analogue building block for chemical DNA synthesis can be used to
substitute one or more of the thymidine/uridine nucleotides in the nucleic
acid primer sequence. The Fmoc group is removed at the end of the solid
phase synthesis with a 20 minute treatment with a 20% solution of
piperidine in DMF at room temperature. DMF is removed by a washing
step with acetonitrile and the oligonucleotide deprotected and purified.

Mass modification at the heterocyclic base with β-alanine:
0.281 g (1.0 mmol) 5-(3-Aminopropynyl-1)-2'-deoxyuridine was reacted
with N-Fmoc-β-alanine pentafluorophenylester (0.955 g; 2.0 mmol) in 5 ml
N,N-dimethylformamide (DMF) in the presence of 0.129 g (174 µl; 1.0
mmol) N,N-diisopropylethylamine for 60 minutes at room temperature.
Solvents were removed and the product purified by silica gel
chromatography. Yield was 0.425 g (0.74 mmol; 74%). Another β-alanine
moiety can be added in exactly the same way after removal of the Fmoc
group. The preparation of the 5'-O-dimethoxytritylated nucleoside-3'-O-β-
cyanoethyl-N,N-diisopropylphosphoamidite from 5-(3-(N-Fmoc-β-alanyl)-
amidopropynyl-1)-2'-deoxyuridine and incorporation into automated
oligonucleotide synthesis is performed under standard conditions. This
building block can substitute for any of the thymidine/uridine residues in the
nucleic acid primer sequence.

Mass modification at the heterocyclic base with ethylene glycol
monomethyl ether: 5-(3-aminopropynyl-1)-2'-deoxyuridine was used as a
nucleosidic component in this example. 7.61 g (100.0 mmol) freshly
distilled ethylene glycol monomethyl ether dissolved in 50 ml absolute



pyridine was reacted with 10.01 g (100.0 mmol) recrystallized succinic
anhydride in the presence of 1.22 g (10.0 mmol) 4-N,N-
dimethylaminopyridine overnight at room temperature. The reaction was
terminated by the addition of water (5.0 ml), the reaction mixture evaporated
in vacuo, co-evaporated twice with dry toluene (20 ml each) and the residue
redissolved in 100 ml dichloromethane. The solution was extracted twice
successively with 10% aqueous citric acid (2 x 20 ml) and once with water
(20 ml) and the organic phase dried over anhydrous sodium sulfate. The
organic phase was evaporated in vacuo. Residue was redissolved in 50 ml
dichloromethane and precipitated into 500 ml pentane and the precipitate
dried in vacuo. Yield was 13.12 g (74.0 mmol; 74%). 8.86 g (50.0 mmol)
of succinylated ethylene glycol monomethyl ether was dissolved in 100 ml
dioxane containing 5% dry pyridine (5 ml) and 6.96 g (50.0 mmol) 4-
nitrophenol and 10.32 g (50.0 mmol) dicyclohexylcarbodiimide was added
and the reaction run at room temperature for 4 hours. Dicyclohexylurea was
removed by filtration, the filtrate evaporated in vacuo and the residue
redissolved in 50 ml anhydrous DMF. 12.5 ml (about 12.5 mmol 4-
nitrophenylester) of this solution was used to dissolve 2.81 g (10.0 mmol)
5-(3-aminopropynyl-1)-2'-deoxyuridine. The reaction was performed in the
presence of 1.01 g (10.0 mmol; 1.4 ml) triethylamine overnight at room
temperature. The reaction mixture was evaporated in vacuo, co-evaporated
with toluene, redissolved in dichloromethane and chromatographed on
silica gel (Si60, Merck; column 4 x 50 cm) with dichloromethane/methanol
mixtures. Fractions containing the desired compound were collected,
evaporated, redissolved in 25 ml dichloromethane and precipitated into 250
ml pentane. The dried precipitate of 5-(3-N-(O-succinyl ethylene glycol




monomethyl ether)-amidopropynyl-1)-2'-deoxyuridine (yield 65%) is 5'-O-
dimethoxytritylated and transformed into the nucleoside-3'-O-β-cyanoethyl-
N,N-diisopropylphosphoamidite and incorporated as a building block in the
automated oligonucleotide synthesis according to standard procedures. The
mass-modified nucleotide can substitute for one or more of the
thymidine/uridine residues in the nucleic acid primer sequence.
Deprotection and purification of the primer oligonucleotide also follows
standard procedures.

Mass modification at the heterocyclic base with diethylene
glycol monomethyl ether: Nucleosidic starting material was as in previous
examples, 5-(3-aminopropynyl-1)-2'-deoxyuridine. 12.02 g (100.0 mmol)
freshly distilled diethylene glycol monomethyl ether dissolved in 50 ml
absolute pyridine was reacted with 10.01 g (100.0 mmol) recrystallized
succinic anhydride in the presence of 1.22 g (10.0 mmol) 4-N,N-
dimethylaminopyridine (DMAP) overnight at room temperature. Yield was
18.35 g (82.3 mmol; 82.3%). 11.06 g (50.0 mmol) of succinylated
diethylene glycol monomethyl ether was transformed into the 4-
nitrophenylester and, subsequently, 12.5 mmol was reacted with 2.81 g (10.0
mmol) of 5-(3-aminopropynyl-1)-2'-deoxyuridine. Yield after silica gel
column chromatography and precipitation into pentane was 3.34 g (6.9
mmol; 69%). After dimethoxytritylation and transformation into the
nucleoside-β-cyanoethylphosphoamidite, the mass-modified building block
is incorporated into automated chemical DNA synthesis. Within the
sequence of the nucleic acid primer, one or more of the thymidine/uridine
residues can be substituted by this mass-modified nucleotide.




Mass modification at the heterocyclic base with glycine:
Starting material was N6-benzoyl-8-bromo-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxyadenosine (Singh et al., Nuc. Acids Res. 18:3339-45, 1990). 632.5 mg
(1.0 mmol) of this 8-bromo-deoxyadenosine derivative was suspended in 5
ml absolute ethanol and reacted with 251.2 mg (2.0 mmol) glycine methyl
ester (hydrochloride) in the presence of 241.4 mg (2.1 mmol; 366 µl) N,N-
diisopropylethylamine and refluxed until the starting nucleosidic material
had disappeared (4-6 hours) as checked by thin layer chromatography
(TLC). The solvent was evaporated and the residue purified by silica gel
chromatography (column 2.5 x 50 cm) using solvent mixtures of
chloroform/methanol containing 0.1% pyridine. Product fractions were
combined, the solvent evaporated, the fractions dissolved in 5 ml
dichloromethane and precipitated into 100 ml pentane. Yield was 487 mg
(0.76 mmol; 76%). Transformation into the corresponding nucleoside-β-
cyanoethylphosphoamidite and integration into automated chemical DNA
synthesis is performed under standard conditions. During final deprotection
with aqueous concentrated ammonia, the methyl group is removed from the
glycine moiety. The mass-modified building block can substitute one or
more deoxyadenosine/adenosine residues in the nucleic acid primer
sequence.

Mass modification at the heterocyclic base with
glycylglycine: 632.5 mg (1.0 mmol) N6-Benzoyl-8-bromo-5'-O-
(4,4'-dimethoxytrityl)-2'-deoxyadenosine was suspended in 5 ml absolute
ethanol and reacted with 324.3 mg (2.0 mmol) glycyl-glycine methyl ester
in the presence of 241.4 mg (2.1 mmol; 366 µl) N,N-diisopropylethylamine.
The mixture was refluxed and completeness of the reaction checked by
TLC. Yield after silica gel column chromatography and precipitation into
pentane was 464 mg (0.65 mmol; 65%). Transformation into the
nucleoside-β-cyanoethylphosphoamidite and into synthetic oligonucleotides
is done according to standard procedures.

Mass modification at the heterocyclic base with glycol
monomethyl ether: Starting material was 5'-O-(4,4-dimethoxytrityl)-2'-
amino-2'-deoxythymidine synthesized (Verheyden et al., J. Org. Chem.
36:250-54, 1971; Sasaki et al., J. Org. Chem. 41:3138-43, 1976; Imazawa et
al., J. Org. Chem. 44:2039-41, 1979; Hobbs et al., J. Org. Chem. 42:714-19,
1976; Ikehara et al., Chem. Pharm. Bull. Japan 26:240-44, 1978). 5'-O-(4,4-
Dimethoxytrityl)-2'-amino-2'-deoxythymidine (559.62 mg; 1.0 mmol) was
reacted with 2.0 mmol of the 4-nitrophenyl ester of succinylated ethylene
glycol monomethyl ether in 10 ml dry DMF in the presence of 1.0 mmol
(140 µl) triethylamine for 18 hours at room temperature. The reaction
mixture was evaporated in vacuo, co-evaporated with toluene, redissolved
in dichloromethane and purified by silica gel chromatography (Si60, Merck;
column: 2.5 x 50 cm; eluent: chloroform/methanol mixtures containing 0.1%
triethylamine). The product containing fractions were combined, evaporated
and precipitated into pentane. Yield was 524 mg (0.73 mmol; 73%).
Transformation into the nucleoside-β-cyanoethyl-N,N-
diisopropylphosphoamidite and incorporation into the automated chemical
DNA synthesis protocol is performed by standard procedures. The mass-
modified deoxythymidine derivative can substitute for one or more of the
thymidine residues in the nucleic acid primer.

In an analogous way, by employing the 4-nitrophenyl ester of
succinylated diethylene glycol monomethyl ether and triethylene glycol
monomethyl ether, the corresponding mass-modified oligonucleotides are
prepared. In the case of only one incorporated mass-modified nucleoside
within the sequence, the mass difference between the ethylene, diethylene
and triethylene glycol derivatives is 44.05, 88.1 and 132.15 daltons,
respectively.
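
The three values quoted for the glycol series are consistent with whole multiples of one oxyethylene repeat unit (C2H4O). The sketch below recomputes that unit mass from rounded standard atomic weights; treating the quoted values as one, two and three such units is an interpretation of the text rather than something it states, and the names are illustrative.

```python
# Rounded standard atomic weights, in daltons.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# One -CH2-CH2-O- repeat unit.
OXYETHYLENE = {"C": 2, "H": 4, "O": 1}


def formula_mass(composition):
    """Sum atomic weights for a composition such as {"C": 2, "H": 4, "O": 1}."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())


def glycol_increment(units):
    """Mass of `units` oxyethylene repeat units."""
    return units * formula_mass(OXYETHYLENE)


if __name__ == "__main__":
    for units in (1, 2, 3):
        print(units, round(glycol_increment(units), 2))
```

A spacing of about 44 daltons between successive tags is several times smaller than the mass of any nucleotide, so tags of this kind shift a ladder without being mistaken for an extra base.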
Mass modification at the heterocyclic base by alkylation:
Phosphorothioate-containing oligonucleotides were prepared (Gait et al.,
Nuc. Acids Res. 19:1183, 1991). One, several or all internucleotide linkages
can be modified in this way. The (-)M13 nucleic acid primer sequence (17-
mer) 5'-dGTAAAACGACGGCCAGT (SEQ ID NO 29) is synthesized in
0.25 µmole scale on a DNA synthesizer and one phosphorothioate group
introduced after the final synthesis cycle (G to T coupling). Sulfurization,
deprotection and purification followed standard protocols. Yield was 31.4
nmole (12.6% overall yield), corresponding to 31.4 nmole phosphorothioate
groups. Alkylation was performed by dissolving the residue in 31.4 µl TE
buffer (0.01 M Tris pH 8.0, 0.001 M EDTA) and by adding 16 µl of a
20 mM solution of 2-iodoethanol (320 nmole; 10-fold excess
with respect to phosphorothioate diesters) in N,N-dimethylformamide
(DMF). The alkylated oligonucleotide was purified by standard reversed
phase HPLC (RP-18 Ultrasphere, Beckman; column: 4.5 x 250 mm; 100 mM
triethylammonium acetate, pH 7.0 and a gradient of 5 to 40% acetonitrile).
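
The quantities in the alkylation step can be checked with a few lines of arithmetic: 16 µl of a 20 mM reagent solution, 31.4 nmole of phosphorothioate groups, and a 0.25 µmole (250 nmole) synthesis scale. This is a sketch with illustrative function names, not part of the described procedure.

```python
def reagent_nmol(volume_ul, concentration_mM):
    """Microliters multiplied by millimolar gives nanomoles."""
    return volume_ul * concentration_mM


def fold_excess(reagent_amount_nmol, substrate_nmol):
    """Molar excess of reagent over substrate."""
    return reagent_amount_nmol / substrate_nmol


def overall_yield_percent(product_nmol, scale_nmol):
    """Isolated product as a percentage of the synthesis scale."""
    return 100.0 * product_nmol / scale_nmol


if __name__ == "__main__":
    iodoethanol = reagent_nmol(16, 20)
    print("2-iodoethanol (nmol):", iodoethanol)
    print("fold excess:", round(fold_excess(iodoethanol, 31.4), 1))
    print("overall yield (%):", round(overall_yield_percent(31.4, 250), 1))
```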

In a variation of this procedure, the nucleic acid primer
containing one or more phosphorothioate phosphodiester bonds is used in the
Sanger sequencing reactions. The primer-extension products of the four
sequencing reactions are purified, cleaved off the solid support, lyophilized
and dissolved in 4 µl each of TE buffer pH 8.0 and alkylated by addition of
2 µl of a 20 mM solution of 2-iodoethanol in DMF. It is then analyzed by
ES and/or MALDI mass spectrometry.

In an analogous way, employing instead of 2-iodoethanol, e.g.,
3-iodopropanol, 4-iodobutanol mass-modified nucleic acid primer are
obtained with a mass difference of 14.03, 28.06 and 42.03 daltons
respectively compared to the unmodified phosphorothioate phosphodiester-
containing oligonucleotide.

Example 18 Mass Modification of Nucleotide Triphosphates.






Mass modification of nucleotide triphosphates at the 2' and
3' amino function: Starting material was 2'-azido-2'-deoxyuridine prepared
according to literature (Verheyden et al., J. Org. Chem. 36:250, 1971),
which was 4,4-dimethoxytritylated at 5'-OH with 4,4-dimethoxytrityl
chloride in pyridine and acetylated at 3'-OH with acetic anhydride in a one-
pot reaction using standard reaction conditions. With 191 mg (0.71 mmol)
2'-azido-2'-deoxyuridine as starting material, 396 mg (0.65 mmol; 90.8%)
5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-azido-2'-deoxyuridine was
obtained after purification via silica gel chromatography. Reduction of the
azido group was performed (Barta et al., Tetrahedron 46:587-94, 1990).
Yield of 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-amino-2'-deoxyuridine
after silica gel chromatography was 288 mg (0.49 mmol; 76%). This
protected 2'-amino-2'-deoxyuridine derivative (588 mg, 1.0 mmol) was
reacted with 2 equivalents (927 mg; 2.0 mmol) N-Fmoc-glycine
pentafluorophenyl ester in 10 ml dry DMF overnight at room temperature
in the presence of 1.0 mmol (174 µl) N,N-diisopropylethylamine. Solvents
were removed by evaporation in vacuo and the residue purified by silica gel
chromatography. Yield was 711 mg (0.71 mmol; 82%). Detritylation was
achieved by a one hour treatment with 80% aqueous acetic acid at room
temperature. The residue was evaporated to dryness, co-evaporated twice
with toluene, suspended in 1 ml dry acetonitrile and 5'-phosphorylated with
POCl3 and directly transformed in a one-pot reaction to the 5'-triphosphate
using 3 ml of a 0.5 M solution (1.5 mmol) tetra(tri-n-butylammonium)
pyrophosphate in DMF according to literature. The Fmoc and the 3'-O-
acetyl groups were removed by a one-hour treatment with concentrated
aqueous ammonia at room temperature and the reaction mixture evaporated




and lyophilized. Purification also followed standard procedures by using
anion-exchange chromatography on DEAE Sephadex with a linear gradient
of triethylammonium bicarbonate (0.1 M - 1.0 M). Triphosphate containing
fractions, checked by thin layer chromatography on polyethyleneimine
cellulose plates, were collected, evaporated and lyophilized. Yield by UV-
absorbance of the uracil moiety was 68% or 0.48 mmol.

A glycyl-glycine modified 2'-amino-2'-deoxyuridine-5'-
triphosphate was obtained by removing the Fmoc group from 5'-O-(4,4-
dimethoxytrityl)-3'-O-acetyl-2'-N-(N-9-fluorenylmethyloxycarbonyl-glycyl)-
2'-amino-2'-deoxyuridine by a one-hour treatment with a 20% solution of
piperidine in DMF at room temperature, evaporation of solvents, two-fold
co-evaporation with toluene and subsequent condensation with N-Fmoc-
glycine pentafluorophenyl ester. Starting with 1.0 mmol of the 2'-N-glycyl-
2'-amino-2'-deoxyuridine derivative and following the procedure described
above, 0.72 mmol (72%) of the corresponding 2'-(N-glycyl-glycyl)-2'-
amino-2'-deoxyuridine-5'-triphosphate was obtained.

Starting with 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-amino-
2'-deoxyuridine and coupling with N-Fmoc-β-alanine pentafluorophenyl
ester, the corresponding 2'-(N-β-alanyl)-2'-amino-2'-deoxyuridine-5'-
triphosphate is synthesized. These modified nucleoside triphosphates are
incorporated during the Sanger DNA sequencing process in the primer-
extension products. The mass difference between the glycine, β-alanine and
glycyl-glycine mass-modified nucleosides is, per nucleotide incorporated,
58.06, 72.09 and 115.1 daltons, respectively.
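
One way to picture how these three modifications could be told apart in a multiplexed spectrum is to compare an observed mass shift against the quoted values and pick the closest. The sketch below reads the three figures as mass increments relative to the unmodified nucleotide, which is an interpretation of the sentence above; the 3.0 dalton tolerance and all names are likewise assumptions for illustration.

```python
# Mass increments per incorporated nucleotide (daltons), as quoted in the text.
TAG_INCREMENT = {
    "glycine": 58.06,
    "beta-alanine": 72.09,
    "glycyl-glycine": 115.1,
}


def assign_tag(observed_mass, unmodified_mass, tolerance=3.0):
    """Return the tag whose increment best explains the shift, or None."""
    shift = observed_mass - unmodified_mass
    tag, increment = min(TAG_INCREMENT.items(),
                         key=lambda item: abs(item[1] - shift))
    return tag if abs(increment - shift) <= tolerance else None


def smallest_spacing():
    """Smallest gap between any two tag increments."""
    values = sorted(TAG_INCREMENT.values())
    return min(upper - lower for lower, upper in zip(values, values[1:]))


if __name__ == "__main__":
    print(assign_tag(6076.02 + 72.09, 6076.02))
    print(round(smallest_spacing(), 2))
```

The smallest spacing, between the glycine and β-alanine values, corresponds to a single methylene group, so mass accuracy must be better than a few daltons for the assignment to be reliable.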

When starting with 5'-O-(4,4-dimethoxytrityl)-3'-amino-2',3'-
dideoxythymidine, the corresponding 3'-(N-glycyl)-3'-amino-, 3'-(N-glycyl-
glycyl)-3'-amino-, and 3'-(N-β-alanyl)-3'-amino-2',3'-dideoxythymidine-5'-
triphosphates can be obtained. These mass-modified nucleoside
triphosphates serve as a terminating nucleotide unit in the Sanger DNA
sequencing reactions providing a mass difference per terminated fragment
of 58.06, 72.09 and 115.1 daltons respectively when used in the
multiplexing sequencing mode. The mass-differentiated fragments are
analyzed by ES and/or MALDI mass spectrometry.

Mass modification of nucleotide triphosphates at C-5 of the
heterocyclic base: 0.281 g (1.0 mmol) 5-(3-Aminopropynyl-1)-2'-
deoxyuridine was reacted with either 0.927 g (2.0 mmol) N-Fmoc-glycine
pentafluorophenylester or 0.955 g (2.0 mmol) N-Fmoc-β-alanine
pentafluorophenyl ester in 5 ml dry DMF in the presence of 0.129 g N,N-
diisopropylethylamine (174 µl, 1.0 mmol) overnight at room temperature.
Solvents were removed by evaporation in vacuo and the condensation
products purified by flash chromatography on silica gel (Still et al., J. Org.
Chem. 43:2923-25, 1978). Yields were 476 mg (0.85 mmol; 85%) for the
glycine and 436 mg (0.76 mmol; 76%) for the β-alanine derivatives. For the
synthesis of the glycyl-glycine derivative, the Fmoc group of 1.0 mmol
Fmoc-glycine-deoxyuridine derivative was removed by one-hour treatment
with 20% piperidine in DMF at room temperature. Solvents were removed
by evaporation in vacuo, the residue was coevaporated twice with toluene
and condensed with 0.927 g (2.0 mmol) N-Fmoc-glycine pentafluorophenyl
ester and purified as described above. Yield was 445 mg (0.72 mmol; 72%).
The glycyl-, glycyl-glycyl- and β-alanyl-2'-deoxyuridine derivatives, N-
protected with the Fmoc group, were transformed to the 3'-O-acetyl
derivatives by tritylation with 4,4-dimethoxytrityl chloride in pyridine and




acetylation with acetic anhydride in pyridine in a one-pot reaction and
subsequently detritylated by one hour treatment with 80% aqueous acetic
acid according to standard procedures. Solvents were removed, the residues
dissolved in 100 ml chloroform and extracted twice with 50 ml 10% sodium
bicarbonate and once with 50 ml water, dried with sodium sulfate, the
solvent evaporated and the residues purified by flash chromatography on
silica gel. Yields were 361 mg (0.60 mmol; 71%) for the glycyl-, 351 mg
(0.57 mmol; 75%) for the β-alanyl- and 323 mg (0.49 mmol; 68%) for the
glycyl-glycyl-3'-O-acetyl-2'-deoxyuridine derivatives, respectively.
Phosphorylation at the 5'-OH with POCl3, transformation into the 5'-
triphosphate by in situ reaction with tetra(tri-n-butylammonium)
pyrophosphate in DMF, 3'-de-O-acetylation, cleavage of the Fmoc group,
and final purification by anion-exchange chromatography on DEAE-
Sephadex was performed and yields according to UV-absorbance of the
uracil moiety were 0.41 mmol 5-(3-(N-glycyl)-amidopropynyl-1)-2'-
deoxyuridine-5'-triphosphate (84%), 0.43 mmol 5-(3-(N-β-alanyl)-
amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (75%) and 0.38 mmol 5-
(3-(N-glycyl-glycyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate
(78%). These mass-modified nucleoside triphosphates were incorporated
during the Sanger DNA sequencing primer-extension reactions.
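
The step yields in this synthesis can be cross-checked against the molar amounts given, and chained to estimate an overall yield for a route. The sketch below does this for the glycyl derivative, taking 0.85 mmol from 1.0 mmol in the coupling step, 0.60 mmol after acetylation and detritylation, and the stated 84% for conversion to the triphosphate. The function names are illustrative, and multiplying step yields assumes each step was carried forward on the full amount of the previous product.

```python
def percent_yield(product_mmol, starting_mmol):
    """Yield of one step as a percentage."""
    return 100.0 * product_mmol / starting_mmol


def overall_yield(step_yields_percent):
    """Product of successive step yields, as a percentage."""
    overall = 1.0
    for step in step_yields_percent:
        overall *= step / 100.0
    return 100.0 * overall


if __name__ == "__main__":
    print(round(percent_yield(0.85, 1.0)))    # coupling with Fmoc-glycine
    print(round(percent_yield(0.60, 0.85)))   # 3'-O-acetyl derivative
    print(round(overall_yield([85, 71, 84]), 1))
```

Chaining the three steps in this way suggests that roughly half of the starting nucleoside ends up as the glycyl-modified triphosphate.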

When using 5-(3-aminopropynyl)-2',3'-dideoxyuridine as starting
material and following an analogous reaction sequence, the corresponding
glycyl-, glycyl-glycyl- and β-alanyl-2',3'-dideoxyuridine-5'-
triphosphates were obtained in yields of 69%, 63% and 71%, respectively.
These mass-modified nucleoside triphosphates serve as chain-terminating
nucleotides during the Sanger DNA sequencing reactions. The
mass-modified sequencing ladders are analyzed by either ES or MALDI mass
spectrometry.
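
As a rough illustration of how a sequencing ladder can be read by mass spectrometry, the sketch below assigns a base to each step of a ladder from the mass difference between successive fragments. The residue masses are the commonly quoted average masses of unmodified 2'-deoxynucleotide residues; the ladder values, the tolerance and the function name are illustrative assumptions and are not taken from this specification. A mass-modified nucleotide would contribute its own, shifted, residue mass as an extra table entry.

```python
# Average masses (Da) of unmodified DNA nucleotide residues within a chain.
# Standard reference values, not data from this specification.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def read_ladder(masses, tolerance=1.0):
    """Assign one base per ladder step from successive mass differences.

    `masses` are fragment masses of a nested set (hypothetical input).
    A step that matches no residue within `tolerance` is reported as 'N'.
    """
    ordered = sorted(masses)
    bases = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        # Pick the residue whose mass is closest to the observed step.
        base, residue = min(RESIDUE_MASS.items(), key=lambda kv: abs(kv[1] - delta))
        bases.append(base if abs(residue - delta) <= tolerance else "N")
    return "".join(bases)


# Hypothetical ladder: a 5000.00 Da primer extended by A, then C, then G, then T.
ladder = [5000.00, 5313.21, 5602.39, 5931.60, 6235.80]
print(read_ladder(ladder))
```

The closest pair of unmodified residues (A and T) differs by only about 9 Da, which is one reason the specification introduces mass-modifying groups to widen the spacing between ladder peaks.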

Mass modification of nucleotide triphosphates: 727 mg (1.0 mmol) of
N6-(4-tert-butylphenoxyacetyl)-8-glycyl-5'-O-(4,4-dimethoxytrityl)-2'-
deoxyadenosine or 800 mg (1.0 mmol) of
N6-(4-tert-butylphenoxyacetyl)-8-glycyl-glycyl-5'-O-(4,4-dimethoxytrityl)-
2'-deoxyadenosine prepared according to literature (Köster et al.,
Tetrahedron 37:362, 1981) were acetylated with acetic anhydride in
pyridine at the 3'-OH, detritylated at the 5'-position with 80% acetic
acid in a one-pot reaction and transformed into the 5'-triphosphates via
phosphorylation with POCl3 and reaction in situ with
tetra(tri-n-butylammonium) pyrophosphate. Deprotection of the
N6-tert-butylphenoxyacetyl, the 3'-O-acetyl and the O-methyl group at the
glycine residues was achieved with concentrated aqueous ammonia for
ninety minutes at room temperature. Ammonia was removed by lyophilization
and the residue washed with dichloromethane, solvent removed by
evaporation in vacuo and the remaining solid material purified by
anion-exchange chromatography on DEAE-Sephadex using a linear gradient of
triethylammonium bicarbonate from 0.1 to 1.0 M. The nucleoside
triphosphate containing fractions (checked by TLC on polyethyleneimine
cellulose plates) were combined and lyophilized. Yield of the
8-glycyl-2'-deoxyadenosine-5'-triphosphate (determined by UV-absorbance
of the adenine moiety) was 57% (0.57 mmol). The yield for the
8-glycyl-glycyl-2'-deoxyadenosine-5'-triphosphate was 51% (0.51 mmol).
These mass-modified nucleoside triphosphates were incorporated during
primer-extension in the Sanger DNA sequencing reactions.
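
The yields quoted in this paragraph follow from simple mole arithmetic, sketched below for convenience. The function names are illustrative; the amounts are the ones stated in the text (1.0 mmol of each protected starting material, 0.57 mmol and 0.51 mmol of the two triphosphates).

```python
def percent_yield(product_mmol, starting_mmol):
    """Molar yield of product relative to the starting material, in percent."""
    return 100.0 * product_mmol / starting_mmol


def implied_molar_mass(mass_mg, amount_mmol):
    """Molar mass in g/mol implied by a weighed mass (mg/mmol equals g/mol)."""
    return mass_mg / amount_mmol


# Amounts as stated in the paragraph above.
yield_glycyl = percent_yield(0.57, 1.0)
yield_glycyl_glycyl = percent_yield(0.51, 1.0)
print(round(yield_glycyl), round(yield_glycyl_glycyl))

# 727 mg weighed out as 1.0 mmol implies a molar mass of about 727 g/mol.
print(implied_molar_mass(727, 1.0))
```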



When using the corresponding N6-(4-tert-butylphenoxyacetyl)-8-glycyl- or
-glycyl-glycyl-5'-O-(4,4-dimethoxytrityl)-2',3'-dideoxyadenosine
derivatives as starting materials (for the introduction of the
2',3'-function: Seela et al., Helvetica Chimica Acta 74:1048-58, 1991)
and using an analogous reaction sequence, the chain-terminating
mass-modified nucleoside triphosphates 8-glycyl- and
8-glycyl-glycyl-2',3'-dideoxyadenosine-5'-triphosphates were obtained in
53 and 47% yields, respectively. The mass-modified sequencing fragment
ladders are analyzed by either ES or MALDI mass spectrometry.

Example 19 Mass Modification of Nucleotides by Alkylation After Sanger
Sequencing.

2',3'-Dideoxythymidine-5'-(alpha-S)-triphosphate was prepared according
to published procedures (for the alpha-S-triphosphate moiety: Eckstein
et al., Biochemistry 15:1685, 1976 and Accounts Chem. Res. 12:204, 1978;
and for the 2',3'-dideoxy moiety: Seela et al., Helvetica Chimica Acta
74:1048-58, 1991). Sanger DNA sequencing reactions employing
2'-deoxythymidine-5'-(alpha-S)-triphosphate are performed according to
standard protocols. When using
2',3'-dideoxythymidine-5'-(alpha-S)-triphosphates, this is used instead
of the unmodified 2',3'-dideoxythymidine-5'-triphosphate in standard
Sanger DNA sequencing. The template (2 picomole) and the nucleic acid M13
sequencing primer (4 picomole) are annealed by heating to 65°C in 100 µl
of 10 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 50 mM NaCl, 7 mM dithiothreitol
(DTT) for 5 minutes and slowly brought to 37°C during a one hour period.
The sequencing reaction mixtures contain, as exemplified for the
T-specific termination reaction, in a final volume of 150 µl, 200 µM
(final concentration) each of dATP, dCTP, dTTP, 300 µM c7-deaza-dGTP,
5 µM 2',3'-dideoxythymidine-5'-(alpha-S)-triphosphate and 40 units
Sequenase. Polymerization is performed for 10 minutes at 37°C, the
reaction mixture heated to 70°C to inactivate the Sequenase, ethanol
precipitated and coupled to thiolated Sequelon membrane disks (8 mm
diameter). Alkylation is performed by treating the disks with 10 µl of
10 mM solution of either 2-iodoethanol or 3-iodopropanol in NMM
(N-methylmorpholine/water/2-propanol, 2/49/49, v/v/v) (three times),
washing with 10 µl NMM (three times) and cleaving the alkylated
T-terminated primer-extension products off the support by treatment with
DTT. Analysis of the mass-modified fragment families is performed with
either ES or MALDI mass spectrometry.

Example 20 Mass Modification of an Oligonucleotide.

This method, in addition to mass modification, also modifies the
phosphate backbone of the nucleic acids to a non-ionic polar form.
Oligonucleotides can be obtained by chemical synthesis or by enzymatic
synthesis using DNA polymerases and α-thio nucleoside triphosphates.

This reaction was performed using DMT-TpT as a starting material but the
use of an oligonucleotide with an alpha thio group is also appropriate.
For thiolation, 45 mg (0.05 mM) of compound 1 (Figure 15) is dissolved in
0.5 ml acetonitrile and thiolated in a 1.5 ml tube with
1,1-dioxo-1-H-benzo[1,2]dithiol-3-one (Beaucage reagent). The reaction
was allowed to proceed for 10 minutes and the product is concentrated by
thin layer chromatography with the solvent system dichloromethane/96%
ethanol/pyridine (87%/13%/1%; v/v/v). The thiolated compound 2 (Figure
15) is deprotected by treatment with a mixture of concentrated aqueous
ammonia/acetonitrile (v/v) at room temperature. This reaction is
monitored by thin layer chromatography and the quantitative removal of
the beta-cyanoethyl group was accomplished in one hour. This reaction
mixture was evaporated in vacuo.

To synthesize the S-(2-amino-2-oxyethyl)thiophosphate triester of
DMT-TpT (compound 4), the foam obtained after evaporation of the reaction
mixture (compound 3) was dissolved in 0.3 ml acetonitrile/pyridine (5/1;
v/v) and a 1.5 molar excess of iodoacetamide added. The reaction was
complete in 10 minutes and the precipitated salts were removed by
centrifugation. The supernatant is lyophilized, dissolved in 0.3 ml
acetonitrile and purified by preparative thin layer chromatography with a
solution of dichloromethane/96% ethanol (85%/15%; v/v). Two fractions are
obtained which contain one of the two diastereoisomers. The two forms
were separated by HPLC.

Example 21 MALDI-MS Analysis of a Mass-Modified Oligonucleotide.

A 17-mer was mass modified at C-5 of one or two deoxyuridine moieties.
5-[13-(2-Methoxyethoxy)-tridecyne-1-yl]-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxyuridine-3'-β-cyanoethyl-N,N-diisopropylphosphoamidite was used to
synthesize the modified 17-mers.




The modified 17-mers were:

                 X
d(TAAAACGACGGCCAGUG) (molecular mass: 5454) (SEQ ID NO 30)

  X              X
d(UAAAACGACGGCCAGUG) (molecular mass: 5634) (SEQ ID NO 31)

where X = -C≡C-(CH2)11-OH
(unmodified 17-mer: molecular mass: 5273)

The samples were prepared and 500 fmol of each modified 17-mer was
analyzed using MALDI-MS. Conditions used were reflectron positive ion
mode with an acceleration of 5 kV and post-acceleration of 20 kV. The
MALDI-TOF spectra which were generated were superimposed and are shown in
Figure 16. Thus, mass modification provides a distinction detectable by
mass spectrometry which can be used to identify base sequence
information.
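
The size of the distinction can be read directly from the three molecular masses quoted in this example. The short sketch below only subtracts those figures; the variable and function names are illustrative.

```python
# Molecular masses as stated in Example 21.
UNMODIFIED_MASS = 5273
MODIFIED_MASS = {1: 5454, 2: 5634}  # number of modified residues -> mass


def mass_shifts(unmodified, modified):
    """Mass added relative to the unmodified 17-mer, per modified oligomer."""
    return {count: mass - unmodified for count, mass in modified.items()}


shifts = mass_shifts(UNMODIFIED_MASS, MODIFIED_MASS)
per_modification = {count: shift / count for count, shift in shifts.items()}
print(shifts)
print(per_modification)
```

Each modified residue therefore moves the peak by roughly 180 mass units, so the singly and doubly modified 17-mers are well separated from each other and from the unmodified oligomer.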

Example 22 Capture and Sequencing of a Double-Stranded Target Nucleic
Acid.

In another experiment, a nucleic acid was captured and sequenced by
strand-displacement polymerization. This reaction is shown schematically
in Figure 17. Double-stranded DNA target was prepared by PCR and attached
to magnetic beads as described in Example 6. EcoR I digested plasmid NB34
was used as the DNA template for amplification. NB34 comprises a PCR™ II
plasmid (Invitrogen) with a one kb target human DNA insert. PCR was
performed with a 16-nucleotide upstream primer (primer I,
5'-AACAGCTATTACCATG-3'; SEQ ID NO. 32), and a downstream 5'-end
biotinylated 18-nucleotide primer (primer II,
5'-biotin-CTGAATTAGTCAGGTTGG-3'; SEQ ID NO. 33). Five hundred basepair
PCR products, containing a single BstX I site, were immobilized by
attachment to magnetic beads which were resuspended in a total of 300 µl
reaction buffer containing 200 units of BstX I restriction endonuclease
(Boehringer Mannheim; Indianapolis, IN), 50 mM Tris-HCl pH 7.5, 10 mM
MgCl2, 100 mM NaCl and 1 mM dithiothreitol. The mixture was incubated at
45°C for three hours or until digestion was complete, which was monitored
by agarose gel electrophoresis. After digestion, magnetic beads were
washed twice with 300 µl of TE to remove digested and non-immobilized
fragments, excess nucleotides and restriction endonuclease.

This immobilized DNA was dephosphorylated by resuspending the beads in
100 µl buffer (500 mM Tris-HCl, pH 9.0, 1 mM MgCl2, 0.1 mM ZnCl2, and
1 mM spermidine) containing five units of calf intestinal alkaline
phosphatase (Promega; Madison, WI). The reaction was incubated at 37°C
for 15 minutes and at 56°C for 15 minutes. Five additional units of calf
intestinal alkaline phosphatase were added and a second incubation was
performed at 37°C for 15 minutes and at 56°C for 15 minutes. Beads were
washed twice with TE and resuspended in 300 µl of fresh TE containing
1 M NaCl.

Loading of the beads was checked by incubating 10 µl of the beads with
10 µl of formamide at 95°C for 5 minutes (or by boiling in TE). The
mixture was analyzed by 1% agarose gel electrophoresis with ethidium
bromide staining. A 10 µl bead aliquot generally contains about 80 ng of
immobilized double stranded DNA.

A partial duplex DNA probe containing a four base 3' overhang was used as
a sequencing primer and was ligated with BstX I digested DNA fragments
which were immobilized on magnetic beads. The partial duplex had a
5'-fluorescein labeled 23 mer (DF25-5F) containing a 5' base pairing
region and a 4-base 3' single stranded region (which is complementary to
the sequence of the 3'-protruding end of the corresponding BstX I
digested target DNA as prepared above) and a 19 mer (G-CM1) complementary
to the base pairing region of the 23 mer. The 19 mer was 5'
phosphorylated by the T4 DNA Polymerase and annealed to the corresponding
23 mer in TE at the same molar ratio. Beads, prepared from alkaline
phosphatase treatment which have about 10 pmol immobilized DNA template,
were ligated to 25 pmol of partially duplex probe in a 100 µl volume
containing 200 units of T4 DNA ligase (New England Biolabs; Beverly, MA),
50 mM Tris-HCl, pH 7.8, 10 mM MgCl2, 10 mM dithiothreitol, 1 mM ATP and
25 µg/ml bovine serum albumin. Ligation reactions were performed at room
temperature for two hours or at 4°C overnight. Beads were washed twice
with TE and resuspended in 300 µl of the same buffer.

Sequencing reactions: Thirty µl of beads containing the ligation product
were used for each sequencing reaction. Beads were resuspended in a 13 µl
volume containing 1.5 µl of 10 x Klenow buffer (100 mM Tris-HCl, pH 7.5,
50 mM MgCl2, and 75 mM dithiothreitol) and with or without one µl of
single stranded DNA binding protein (SSB, 5 µg/µl; USB; Cleveland, Ohio).
Mixtures were incubated on ice for 5 minutes followed by the addition of
5 units of Klenow Fragment (New England Biolabs). The reaction volume was
split into four termination mixes, each consisting of 1 µl DMSO and 3 µl
of the appropriate termination mixture. Termination mixtures were made in
Klenow buffer and comprise the nucleotide concentrations shown below in
Table 11.

Table 11

Termination    dATP      dGTP      dCTP      dTTP      ddNTPs
Mix            in mM     in mM     in mM     in mM

ddATP mix      10        100       100       100       100 mM ddATP
ddGTP mix      100       5         100       100       120 mM ddGTP
ddCTP mix      100       100       10        100       100 mM ddCTP
ddTTP mix      100       100       100       5         500 mM ddTTP
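
Each mix in Table 11 lowers the concentration of one dNTP and adds the matching ddNTP, so the quantity that governs how often a chain terminates is the ratio of terminator to its competing dNTP. The sketch below computes that ratio from the table values as printed; the dictionary layout and function name are illustrative assumptions.

```python
# Concentrations as printed in Table 11 (same units throughout).
TERMINATION_MIXES = {
    "ddATP": {"dATP": 10, "dGTP": 100, "dCTP": 100, "dTTP": 100, "ddNTP": 100},
    "ddGTP": {"dATP": 100, "dGTP": 5, "dCTP": 100, "dTTP": 100, "ddNTP": 120},
    "ddCTP": {"dATP": 100, "dGTP": 100, "dCTP": 10, "dTTP": 100, "ddNTP": 100},
    "ddTTP": {"dATP": 100, "dGTP": 100, "dCTP": 100, "dTTP": 5, "ddNTP": 500},
}


def terminator_ratio(mix_name):
    """Ratio of the ddNTP to the dNTP it competes with in the named mix."""
    mix = TERMINATION_MIXES[mix_name]
    competing_dntp = "d" + mix_name[2:]  # e.g. "ddATP" -> "dATP"
    return mix["ddNTP"] / mix[competing_dntp]


ratios = {name: terminator_ratio(name) for name in TERMINATION_MIXES}
print(ratios)
```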



Termination mixtures were incubated for 20 minutes at ambient
temperature. Two µl of chase solution (0.5 mM of each of four dNTPs in
Klenow buffer) were added to each reaction tube and mixtures were
incubated for another 15 minutes, again at ambient temperature. Magnetic
beads were precipitated with a magnetic particle concentrator (or
centrifugation) and the supernatant discarded. Beads were resuspended in
a solution containing 10 µl of deionized formamide, 5 mg/ml dextran blue
and 0.1% SDS, heated to 95°C for 5 minutes, and stored on ice for less
than 10 minutes. Samples were analyzed on a DNA sequencing gel and on an
ALF DNA sequencer (Pharmacia; Piscataway, NJ) using a 6% polyacrylamide
gel with 7 M urea and 0.6 x TBE. Surprisingly, sequencing reactions
performed in the presence of single-stranded DNA binding protein showed
considerable improvement in resolution. Only 50 bases were resolved from
reactions performed without single-stranded DNA binding protein (Figure
18, bottom panel) whereas 200 bases could be resolved from reactions
performed in the presence of single-stranded DNA binding protein (Figure
18, top panel).

Example 23 Specificity of Double-Stranded Sequencing by Strand
Displacement.

Another experiment was performed to determine the specificity and
applicability of the nick translation strand displacement method of
sequencing double-stranded nucleic acids. A schematic of the experimental
design is shown in Figure 19. Briefly, a double-stranded target DNA was
prepared by digesting double-stranded ΦX174 phage DNA with TspR I
restriction endonuclease. TspR I has a recognition site of NNCAGTGNN and
cleaves ΦX174 into 12 fragments each with distinctive 3' protruding ends.
Possible ends are shown in Table 12.

Table 12

 1    5'-AACACTGAC-3'         7    5'-GTCAGTGTT-3'
 2    5'-AACAGTGGA-3'         8
 3    5'-ACCACTGAC-3'         9    5'-GTCACTGAT-3'
 4    5'-AACACTGGT-3'        10    5'-TCCACTGTT-3'
 5    5'-ATCAGTGAC-3'        11    5'-TGCAGTGGA-3'
 6    5'-ACCAGTGTT-3'        12    5'-TCCACTGCA-3'
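
The specificity tested in this example rests on base complementarity between a probe's single-stranded end and one particular protruding end of the digest. The sketch below shows that check for a few of the ends listed in Table 12. The helper names are hypothetical, and treating annealing as an exact reverse-complement match is a simplifying assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_tspri_core(end):
    """True if a 9-base end fits NNCAGTGNN when read from either strand."""
    return len(end) == 9 and end[2:7] in ("CAGTG", "CACTG")


def can_anneal(probe_overhang, target_end):
    """True if the probe overhang is the exact reverse complement of the end."""
    return probe_overhang == reverse_complement(target_end)


# Ends 1, 7 and 3 as listed in Table 12.
end_1 = "AACACTGAC"
end_7 = "GTCAGTGTT"
end_3 = "ACCACTGAC"

print(reverse_complement(end_1))        # compare with end 7
print(can_anneal("GTCAGTGGT", end_3))   # overhang complementary to end 3
print(can_anneal(end_7, end_3))         # overhang that differs from it
```

Under this model a probe ligates only to the fragment end whose nine protruding bases pair perfectly with it, which is why the other fragments in the digest are not expected to contribute sequence signal.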



ΦX174 DNA (5 pmol) was dephosphorylated using calf intestinal alkaline
phosphatase. Briefly, ΦX174 DNA was resuspended in 100 µl buffer (500 mM
Tris-HCl, pH 9.0, 1 mM MgCl2, 0.1 mM ZnCl2 and 1 mM spermidine)
containing 5 units of calf intestinal alkaline phosphatase (Promega;
Madison, WI). The reaction was incubated at 37°C for 15 minutes and at
56°C for 15 minutes. Five additional units of calf intestinal alkaline
phosphatase were added and a second incubation was performed at 37°C for
15 minutes and at 56°C for 15 minutes. DNA in the samples was extracted
once with phenol, once with phenol/chloroform, and once with chloroform,
after which nucleic acid was precipitated in 0.3 M sodium acetate/2.5
volumes ethanol. Precipitated ΦX174 DNA was washed twice with TE and
resuspended in 300 µl of TE containing 1 M NaCl.

Double-stranded probes, comprising biotin (B), fluorescein (F), and
infrared dye (CY5) labels, were synthesized and anchored to magnetic
beads as shown in Table 13.



Table 13

DF27-1        5'-F-GATGATCCGACGCATCACATCAGTGAC-3'      (SEQ ID NO. 34)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 35)

DF27-2        5'-F-GATGATCCGACGCATCACTCCACTGTT-3'      (SEQ ID NO. 36)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 37)

DF27-3        5'-F-GATGATCCGACGCATCACGTCAGTGTT-3'      (SEQ ID NO. 38)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 39)

DF27-4        5'-F-GATGATCCGACGCATCACTGCAGTGGA-3'      (SEQ ID NO. 40)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 41)

DF27-5-CY5    5'-CY5-GATGATCCGACGCATCACGTCACTGAT-3'    (SEQ ID NO. 42)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 43)

DF27-6-CY5    5'-CY5-GATGATCCGACGCATCACAACAGTGGA-3'    (SEQ ID NO. 44)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 45)

DF27-7        5'-F-GATGATCCGACGCATCACGTCAGTGGT-3'      (SEQ ID NO. 46)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 47)

DF27-8        5'-F-GATGATCCGACGCATCACAACACTGGT-3'      (SEQ ID NO. 48)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 49)

DF27-9        5'-F-GATGATCCGACGCATCACAACAGTGAC-3'      (SEQ ID NO. 50)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 51)

DF27-10       5'-F-GATGATCCGACGCATCACACCACTGAC-3'      (SEQ ID NO. 52)
              3'-B-CTACTAGGCTGCGTAGTG-p-5'             (SEQ ID NO. 53)



Beads with about 25 pmol of immobilized primer were ligated to 3 pmol of
TspR I digested ΦX174 DNA in 50 µl containing 400 units of T4 DNA ligase
(New England Biolabs; Beverly, MA), 50 mM Tris-HCl, pH 7.8, 10 mM MgCl2,
10 mM dithiothreitol, 1 mM ATP and 25 µg/ml bovine serum albumin.
Ligation reactions were performed at 37°C for 30 minutes, at 50°C to 55°C
for one hour (thermal ligase), at room temperature for 2 hours or at 4°C
overnight. After ligation, beads were washed twice with TE and
resuspended in 300 µl of the same buffer.

Sequencing reactions: For each sequencing reaction, 30 µl of beads
containing the ligation product was used. Beads were resuspended in a
13 µl volume containing 1.5 µl of 10 x Klenow buffer (100 mM Tris-HCl,
pH 7.5, 50 mM MgCl2 and 75 mM dithiothreitol), and with or without 1 µl
of single-stranded DNA binding protein (SSB, 5 µg/µl; USB; Cleveland,
Ohio). Reaction mixtures were incubated on ice for 5 minutes, followed by
the addition of 5 units of Klenow Fragment (New England Biolabs). The
reaction volume was split into four termination mixes, each consisting of
1 µl DMSO plus 3 µl of the appropriate termination mix. Termination mixes
were made in Klenow buffer and comprise the nucleotide concentrations
shown in Table 11.

Termination mixtures were incubated for 20 minutes at ambient
temperature. Two µl of a chase solution containing 0.5 mM of each of the
four dNTPs in Klenow buffer was added to each reaction tube and mixtures
were incubated for another 15 minutes at ambient temperature. Beads were
precipitated by magnetic particle concentrator or centrifugation and the
supernatant discarded. Precipitated beads were resuspended in TE or in a
solution containing 10 µl deionized formamide, 5 mg/ml dextran blue and
0.1% SDS, and heated to 95°C for 5 minutes. Mixtures were stored on ice
for less than 10 minutes and analyzed by a DNA sequencing gel and on an
ALF DNA sequencer (Pharmacia; Piscataway, NJ) using a 6% polyacrylamide
gel with 7 M urea and 0.6 x TBE.

One double stranded primer was used for each reaction and the results
achieved using primers DF27-1, DF27-2, DF27-4, DF27-5-CY5 and DF27-6-CY5
are shown in Figures 20, 21, 22, 23 and 24, respectively. Each primer was
capable of generating sequencing information of up to 200 basepairs
without significant interference from the 11 fragments with
non-complementary ends.

Other embodiments and uses of the invention will be apparent 
to those skilled in the art from consideration of the specification and practice 
of the invention disclosed herein. All U.S. Patents and other references
noted herein are specifically incorporated by reference. The specification 
and examples should be considered exemplary only with the true scope and 
spirit of the invention indicated by the following claims. 




The claims defining the invention are as follows:

1. A method for sequencing a target nucleic acid, comprising the steps of:
    (a) providing
        (i) a set of nucleic acid fragments, wherein each fragment contains a
sequence that corresponds to a sequence of the target nucleic acid, and
        (ii) an array of nucleic acid probes, wherein each probe comprises a
single-stranded portion comprising a variable region;
    (b) hybridizing the set of nucleic acid fragments to the array of nucleic
acid probes to form a target array of nucleic acids; and
    (c) determining molecular weights of nucleic acids in the target array to
identify hybrids and thereby determine the sequence of the target nucleic acid.

2. A method for sequencing a target nucleic acid, comprising the steps of:
    (a) providing
        (i) a set of nucleic acid fragments, wherein each fragment contains a
sequence that corresponds to a sequence of the target nucleic acid, and
        (ii) an array of nucleic acid probes, wherein each probe comprises a
single-stranded portion comprising a variable region;
    (b) hybridizing the set of nucleic acid fragments to the array of nucleic
acid probes to form a target array of nucleic acids;
    (c) enzymatically extending the nucleic acid probes of the target array
using the hybridized target nucleic acid as a template to form extended
strands; and
    (d) determining molecular weights of the extended strands, whereby the
sequence of the target nucleic acid is determined.

3. A method of detecting a target nucleic acid, comprising the steps of:
    (a) providing
        (i) a set of nucleic acid fragments, wherein each fragment contains a
sequence that corresponds to a sequence of the target nucleic acid, and
        (ii) an array of nucleic acid probes, wherein each probe comprises a
single-stranded portion comprising a variable region;
    (b) hybridizing the set of nucleic acid fragments to the array of nucleic
acid probes to form a target array of nucleic acids; and
    (c) determining molecular weights for nucleic acids of the target array,
whereby the target nucleic acid is detected.

4. A method for sequencing a target nucleic acid, comprising the steps of:
    (a) providing
        (i) a set of partially single-stranded nucleic acid fragments,
wherein each fragment contains a sequence that corresponds to a sequence of
the target nucleic acid,
        (ii) an array of nucleic acid probes, wherein each probe comprises a
single-stranded portion comprising a variable region and a double-stranded
portion;
    (b) hybridizing the single-stranded portions of the fragments to
single-stranded portions of the array of nucleic acid probes;
    (c) ligating single strands of the fragments to adjacent single strands
of the probes;
    (d) extending the unligated strands using the ligated strand as a
template; and
    (e) determining the molecular weights of the extended strands, whereby
the sequence of the target nucleic acid is determined.

5. A method for identifying a target nucleic acid sequence in a mixture
containing a plurality of different nucleic acid sequences, comprising the
steps of:
    (a) treating the nucleic acids to create partially single-stranded,
partially double-stranded nucleic acid fragments;
    (b) hybridizing the single-stranded portions of the fragments to
single-stranded portions of probes comprising a single-stranded portion
comprising a variable region, and a partially double-stranded portion;
    (c) ligating single strands of the fragments to adjacent single strands
of the probes;
    (d) extending the unligated strands using the ligated strand as a
template;
    (e) determining the molecular weights of the extended strands; and
    (f) identifying a target nucleic acid sequence by the molecular weight of
the extended strands.

6. The method of any one of claims 1, 2, 3, 4 or 5, wherein the molecular
weights are determined by methods selected from the group consisting of gel
electrophoresis, capillary electrophoresis, chromatography, and nuclear
magnetic resonance.

7. The method of any one of claims 1, 2, 3, 4 or 5, wherein the molecular
weights are determined by mass spectrometry.

8. The method of claim 7, wherein the mass spectrometry comprises a step
selected from the group consisting of laser heating, droplet release,
electrical release, photochemical release, fast atom bombardment, plasma
desorption, matrix-assisted laser desorption/ionization, electrospray, and
resonance ionization, or a combination thereof.

9. The method of claim 7, wherein the mass spectrometry comprises a step
selected from the group consisting of Fourier Transform, ion cyclotron
resonance, time of flight analysis with reflection, time of flight analysis
without reflection, and quadrupole analysis, or a combination thereof.

10. The method of claim 7, wherein the mass spectrometry comprises
matrix-assisted desorption ionization and time of flight analysis.

11. The method of claim 7, wherein the mass spectrometry comprises
electrospray ionization and quadrupole analysis.

12. The method of claim 7, wherein two or more molecular weights are
determined simultaneously.

13. The method of any one of claims 1, 2, 3, 4 or 5, wherein the nucleic acid
fragments comprise at least one mass-modifying functionality.

14. The method of any one of claims 1, 2, 3, 4 or 5, wherein the nucleic acid
probes comprise at least one mass-modifying functionality.

15. The method of any one of claims 2, 4 or 5, wherein the step of extending
is performed in the presence of chain elongating nucleotides and chain
terminating nucleotides.

16. The method of claim 15, wherein the chain elongating nucleotides comprise
at least one mass-modifying functionality.

17. The method of claim 15, wherein the chain terminating nucleotides
comprise at least one mass-modifying functionality.

18. The method of any one of claims 2, 4 or 5, wherein the extended strands
comprise at least one mass-modifying functionality.

19. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is coupled to a heterocyclic base, a sugar
moiety or a phosphate group.

20. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is a chemical moiety that does not interfere
with hydrogen bonding for base-pair formation.

21. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is coupled to a purine at position C2, N3, N7,
or C8.

22. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is coupled to a deazapurine at position N7 or
C9.




23. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is coupled to a pyrimidine at position C5 or C6.

24. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is selected from the group consisting of F, Cl,
Br, I, SiR, Si(CH3)3, Si(CH3)2(C2H5), Si(CH3)2(C2H5)2, Si(CH3)(C2H5)2,
Si(C2H5)3, (CH2)nCH3, (CH2)nNR, CH2CONR, (CH2)nOH, CH2F, CHF2, and CF3;
wherein n is an integer and R is selected from the group consisting of -H,
deuterium and alkyls, alkoxys and aryls of 1-6 carbon atoms,
polyoxymethylene, monoalkylated polyoxymethylene, polyethylene imine,
polyamide, polyester, alkylated silyl, heterooligo/polyaminoacid and
polyethylene glycol.

25. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is generated from a precursor functionality
which is -N3 or -XR, wherein X is selected from the group consisting of -OH,
-NH2, -NHR, -SH, -NCS, -OCO(CH2)nCOOH, -NHCO(CH2)nCOOH, -OSO2OH,
-OCO(CH2)nI, and -OP(O-alkyl)-N-(alkyl)2, and n is an integer from 1 to 20;
and R is selected from the group consisting of -H, deuterium and alkyls,
alkoxys and aryls of 1-6 carbon atoms, polyoxymethylene, monoalkylated
polyoxymethylene, polyethylene imine, polyamide, polyester, alkylated silyl,
heterooligo/polyaminoacid and polyethylene glycol.

26. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is a thiol moiety.

27. The method of claim 26, wherein the thiol moiety is generated by using
Beaucage reagent.

28. The method of any one of claims 13, 14, 16, 17, or 18, wherein the
mass-modifying functionality is an alkyl moiety.

29. The method of claim 28, wherein the alkyl moiety is generated by using
iodoacetamide.

30. The method of any one of claims 1, 2, 3, 4 or 5, comprising the step of
removing alkali cations.

31. The method of claim 30, wherein the alkali cations are removed by ion
exchange.

32. The method of claim 31, wherein the ion exchange comprises contacting the
nucleic acid with a solution selected from the group consisting of ammonium
acetate, ammonium carbonate, diammonium hydrogen citrate, and ammonium
tartrate, or combinations thereof.

33. The method of any one of claims 1, 2, or 3, comprising the step of
ligating the hybridized target nucleic acids to the probes.






34. The method of any one of claims 1, 2, 3, 4 or 5, wherein the target
nucleic acid is provided from a biological sample.

35. The method of claim 34, wherein the biological sample is obtained from a
patient.

36. The method of any one of claims 1, 2, 3, 4 or 5, wherein the target
nucleic acid is provided from a recombinant source.

37. The method of any one of claims 1, 2, 3, 4 or 5, where the target nucleic
acid is between about 10 to about 1,000 nucleotides in length.

38. The method of any one of claims 1, 2, 3, 4 or 5, where the nucleic acid
fragments are between about 10 to about 1,000 nucleotides in length.

39. The method of any one of claims 1, 2, 3, 4 or 5, wherein each sequence of
the nucleic acid fragments is homologous with at least a portion of the
sequence of the target nucleic acid.

40. The method of any one of claims 1, 2, or 3, wherein each sequence of the
set of nucleic acid fragments is complementary with at least a portion of the
sequence of the target nucleic acid.

41. The method of any one of claims 1, 2, 3, 4 or 5, comprising the step of
dephosphorylating the nucleic acid fragments by treatment with a phosphatase
prior to hybridization.

42. The method of any one of claims 1, 2, 3, 4 or 5, wherein the fragments
are provided by enzymatic digestion of the target nucleic acid.

43. The method of claim 42, wherein the enzymatic digestion is carried out by
a nuclease.

44. The method of any one of claims 1, 2, 3, 4 or 5, wherein the nucleic acid
fragments are provided by physically cleaving the target nucleic acid.

45. The method of any one of claims 1, 2, or 3, wherein the nucleic acid
fragments are provided by enzymatic polymerization of the target nucleic
acid.

46. The method of claim 45, wherein the enzymatic polymerization is a nucleic
acid amplification process selected from the group consisting of strand
displacement amplification, ligase chain reaction, replicase amplification,
3SR amplification, and polymerase chain reaction.

47. The method of claim 45, wherein the enzymatic polymerization is carried
out in the presence of chain elongating nucleotides and chain terminating
nucleotides.

48. The method of any one of claims 1, 2, or 3, wherein the nucleic acid
fragments are provided by synthesizing a complementary copy of the target
sequence.







49. The method of any one of claims 1, 2, or 3, wherein the nix^leic acid 
fragments comprise a nested set 

50. The method of any one of claims 1, 2, 3, 4» or 5, wherein the nucleic acid 
fragments comprise DNA. RNA, PNA or combinations thereof. 

5 51. The method of any one of claims U 2, 3, 4, or 5» wherein the target nucleic 

acid comprises DNA, RNA, PNA or modifications of combinations thereof, 

52. The method of anyone of claims 1, 2, or 3, wherein the fragments of nucleic 
acids comprise greater than about 10"* different members and each member is between 
about 10 to about 1 ,000 nucleotides in length, 
to 53. The method of any one of claims 1, 2, or 3, wherein the probes are single- 

stranded, 

54. The method of any one of claims 1, 2 or 3, wherein the probes comprise a 
double-stranded portion and a single-stranded portion. 

55. The method of any one of claims U 2, 3, 4 or 5, wherein the array comprises a 
IS collection of probes with sufficient sequence diversity in the variable regiom: hybridize 

all of the target sequence with complete or nearly complete discrimination, 

56. The method of any one of claims I, 2, 3, 4 or 5, wherein the probes have a 
single-stranded region at one terminus and a double-stranded region at the opposite 
terminus. 

57. The method of any one of claims 1, 2, 3, 4 or 5, wherein the probes are about 
10 to about 1,000 nucleotides in length. 

58. The method of any one of claims 1, 2, 3, 4 or 5, wherein the probes are about 
15 to about 200 nucleotides in length. 

59. The method of any one of claims 1, 2, 3, 4 or 5, wherein the probes are about 
10 to 50 nucleotides in length. 

60. The method of any one of claims 1, 2, 3, 4 or 5, wherein the double-stranded 
portion is about 4 to about 30 nucleotides in length. 

61. The method of any one of claims 1, 2, 3, 4 or 5, wherein the single-stranded 
portion is about 4 to about 20 nucleotides in length. 

62. The method of any one of claims 1, 2, 3, 4 or 5, wherein the variable region is 
about 4 to about 20 nucleotides in length. 

63. The method of any one of claims 1, 2, 3, 4 or 5, wherein the array of nucleic 
acid probes is attached to a solid support. 






64. The method of claim 61, wherein the solid support is selected from the group 
consisting of plates, beads, microbeads, whiskers, combs, hybridization chips, 
membranes, single crystals, ceramics, and self-assembling monolayers. 

65. The method of claim 63, wherein the probes are conjugated with biotin or a 
biotin derivative and wherein the solid support is conjugated with avidin, streptavidin or a 
derivative thereof. 

66. The method of claim 63, wherein each probe is attached to the solid support 
by a bond selected from the group consisting of covalent bond, electrostatic bond, 
hydrogen bond, cleavable bond, photocleavable bond, disulfide bond, peptide bond, 
diester bond, and selectively releasable bond, or a combination thereof. 

67. The method of claim 66, wherein the cleavable bond is cleaved by a cleaving 
agent selected from the group consisting of heat, an enzyme, a chemical agent, and 
electromagnetic radiation, or a combination thereof. 

68. The method of claim 67, wherein the chemical agent is selected from the 
group consisting of reducing agents, oxidizing agents, and hydrolyzing agents, or a 
combination thereof. 

69. The method of claim 67, wherein the electromagnetic radiation is selected 
from the group consisting of visible radiation, ultraviolet radiation, and infrared radiation. 

70. The method of claim 66, wherein the selectively releasable bond is 
4,4'-dimethoxytrityl or a derivative thereof. 

71. The method of claim 70, wherein the derivative is selected from the group 
consisting of 3 or 4 [bis-(4-methoxyphenyl)]-methyl-benzoic acid, N-succinimidyl-3 or 4 
[bis-(4-methoxyphenyl)]-methyl-benzoic acid, N-succinimidyl-3 or 4 
[bis-(4-methoxyphenyl)]-hydroxymethyl-benzoic acid, N-succinimidyl-3 or 4 
[bis-(4-methoxyphenyl)]-chloromethyl-benzoic acid and salts thereof. 

72. The method of claim 63, comprising a spacer between each probe and the 
solid support. 

73. The method of claim 72, wherein the spacer is selected from the group 
consisting of oligopeptides, oligonucleotides, oligopolyamides, oligoethyleneglycerol, 
oligoacrylamides, and alkyl chains of between about 6 to about 20 carbon atoms, or 
combinations thereof. 

74. The method of claim 63, wherein the solid support comprises a matrix that 
facilitates volatilization of nucleic acids for molecular weight determination. 

75. The method of any one of claims 1, 2, 3, 4 or 5, wherein the nucleic acid 
probes comprise DNA, RNA, PNA, or combinations thereof. 




76. The method of claim 2, comprising the step of: 

ligating a single strand of the fragment to the probe; 

wherein the step of extending is by strand displacement polymerization using 
the ligated strand as a template. 

77. The method of claim 2, wherein the extended strands comprise DNA, RNA, 
PNA or combinations thereof. 

78. The method of claim 3, wherein the detection of the target is indicative of a 
disorder in a patient. 

79. The method of claim 78, wherein the disorder is selected from the group 
consisting of genetic defect, neoplasm, and infection. 

80. The method of claim 5, wherein the single-stranded portion of the probes 
comprises a variable region. 

81. The method of any one of claims 1, 2, 3, 4 or 5, wherein the single-stranded 
portion of the probes comprises a constant region. 

82. The method of claim 4 or 5, wherein extension of the unligated strands 
proceeds in the presence of chain-terminating nucleotides. 

83. A method for sequencing a target nucleic acid, substantially as hereinbefore 
described with reference to any one of the Examples. 

84. A method for detecting a target nucleic acid, substantially as hereinbefore 
described with reference to any one of the Examples. 

85. A method for identifying a target nucleic acid sequence in a mixture 
containing a plurality of different nucleic acid sequences, substantially as hereinbefore 
described with reference to any one of the Examples. 

86. The method of any one of claims 1, 2, 3, 4 or 5, wherein the array comprises 
less than or equal to about 4^R different probes and R is the length in nucleotides of the 
variable region. 
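To make the 4^R bound concrete, the short Python sketch below enumerates every possible variable-region sequence of length R over the four bases and counts them. The function names are hypothetical and the sketch assumes an unmodified four-letter alphabet; it illustrates the arithmetic only, not any array design described in the specification.

```python
from itertools import product

BASES = "ACGT"  # assumed four-letter alphabet


def variable_regions(r):
    """Yield every possible variable-region sequence of length r."""
    for combo in product(BASES, repeat=r):
        yield "".join(combo)


def array_size(r):
    """Number of distinct probes needed to cover all sequences of length r."""
    return len(BASES) ** r


for r in (4, 5, 6):
    print(r, array_size(r))  # 4 256, 5 1024, 6 4096

print(list(variable_regions(2))[:5])  # ['AA', 'AC', 'AG', 'AT', 'CA']
```

A complete array for R = 5 therefore holds 1,024 probes, and each additional nucleotide in the variable region multiplies the count by four.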

Dated 20 January, 2003 
Trustees of Boston University 
Sequenom, Inc. 



Patent Attorneys for the Applicant/Nominated Person 
SPRUSON & FERGUSON 



FIG. 1A 

Legible labels: O=P-O-CH2; m3, m4, m5 
n = 1-50 
M = H, OH, XR, Halogen, N3 








Legible column labels: m2, m3 

Type A (DNA-Termination): XR, OH, H, H 
Type B (DNA-Termination): H, OH, H, XR 
Type C (DNA-Termination): H, XR, H, H 
Type D (RNA-Termination): XR, OH, OH, H 
Type E (RNA-Termination): H, OH, OH, XR 
Type F (RNA-Termination): H, XR, OH, H 

FIG. 2B 



Each left-column entry is shown with: -(CH2CH2O)m-CH2CH2-OH 
or -(CH2CH2O)m-CH2CH2-O-Alkyl 

Legible left-column entries: 
-O- 
-O-C(O)-(CH2)r-C(O)-O- 
-NH-C(O)-, -C(O)-NH- 
-NH-C(O)-(CH2)r-C(O)-O- 
-NH-C(S)-NH- 
-O-P-O-Alkyl 
-O-SO2-O- 
-O-C-CH2-S- 
-NH- 

m = 0, 1-200 
r = 1-20 

FIG. 3 



-H 
and branched, e.g. -CH(CH3)2 
-CH2(CH2)r-O-H 
2,3-Epoxy-1-propanol 
-(CH2)m-CH2-O-H 
-(CH2)m-CH2-O-Alkyl 
-(CH2CH2NH)m-CH2CH2-NH2 
-[NH-(CH2)r-NH-C(O)-(CH2)r-C(O)-]m-NH-(CH2)r-NH-C(O)-(CH2)r-C(O)-OH 
-[NH-(CH2)r-C(O)-]m-NH-(CH2)r-C(O)-OH 
-[NH-CHY-C(O)-]m-NH-CHY-C(O)-OH 
-[O-(CH2)r-C(O)-]m-O-(CH2)r-C(O)-OH 
-Si(Alkyl)3 
-Halogen 
-CH2F, -CHF2, -CF3 

m = 0, 1-200 
r = 1-20 

FIG. 4 



TARGET 
-GGNNNNNNNGC- 
-CGNNNNNNNCG- 
TARGET 

FIG. 5 



3'-NNGTCACNN- 
PROBE 

3' 5' 
NNCAGTNN- 
3'-NNGTCACNN- 
PROBE, TARGET 

LIGATION 
DNA POLYMERASE 

TARGET 
5'-NNCAGTGNN- 
3'-NNGTCACNN- 
TARGET 

FIG. 6 



NUCLEIC ACID STRUCTURE    CALCULATED Tm (°C, AVERAGE BASE COMPOSITION) 

8 7 6 6 



38 33 25 16 

33 25 15 3 

25 15 3 H4 

5* 46 40 31 

46 40 31 2f 

40 31 21 ri 



FIG. 7 



MASTER ARRAY 
[illegible] WITH BIOTINYLATED COMPLEMENTARY STRAND 
SYNTHESIS OF COMPLEMENTARY ARRAY 
CONTACT ABOVE Tm 
STREPTAVIDIN-COATED FILTER OR PLATE ARRAY COMPRESSED BY OFFSETTING 
INCUBATE WITH COMPLEMENTARY STRANDS 
FULL, FINISHED ARRAY 

FIG. 8 



pal I 
dNTPs 



stv 



i 

3' 5' 



stv 

5' 3' 

h 









FIG. 10 

Legend: BIOTIN; STREPTAVIDIN; SURFACE 







Reaction scheme, compounds 1 to 4 

1. THIOLATION 
3. ALKYLATION 

DMT: "DIMETHOXYTRITYL-" BIS(4-METHOXYPHENYL)PHENYLMETHYL- 
T: THYMINE 
Lev: "LAEVULINYL-" 4-OXOPENTANOYL 

FIG. 15 

FIG. 16 



BLOCKING 
PCR 
BstXI DIGEST 

CCANNNNNNTGG 
GGTNNNNNNACC 

DYNAL STREPTAVIDIN BEADS 
NICK SITE 
RESTRICTION DIGESTION 
DEPHOSPHORYLATION 
LIGATION TO PARTIALLY DUPLEX PROBE 
DNA POLYMERASE (LARGE FRAGMENT) 
ddNTP TERMINATION MIX 

5'-F-GATGATCCGACGCATCAGCTGTG- 
3'-CTACTAGGCTGCGTAGTCGpACAC 

FIG. 17 
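The BstXI recognition pattern drawn in FIG. 17, CCANNNNNNTGG with N standing for any base, can be located in a sequence with a simple pattern search. The Python sketch below is one hedged way to do this; the function name and the example sequence are hypothetical, and the sketch reports site positions only, not cut positions.

```python
import re

# CCANNNNNNTGG with N = any of A, C, G, T, as drawn in FIG. 17.
# The lookahead lets overlapping sites be reported as well.
BSTXI_SITE = re.compile(r"(?=(CCA[ACGT]{6}TGG))")


def find_sites(seq):
    """Return 0-based start positions of every site on the given strand."""
    return [m.start() for m in BSTXI_SITE.finditer(seq.upper())]


example = "ttgaCCAACGTACTGGaatt"  # hypothetical sequence, not from the patent
print(find_sites(example))  # [4]
```

Because the reverse complement of CCANNNNNNTGG is again CCANNNNNNTGG, scanning one strand is enough to find the sites on both.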








RESTRICTION DIGESTION BY TspRI 
GENERATING 6 FRAGMENTS WITH TOTAL 
OF 12 DIFFERENT ENDS 

DEPHOSPHORYLATION 
ONE KIND IMMOBILIZED PARTIALLY 
DUPLEX PROBE IS ADDED TO THE 
MIXTURE OF FRAGMENTS AND LIGATION 
REACTION IS PERFORMED 

NICK SITE 

DNA POLYMERASE (LARGE FRAGMENT) 
ddNTP TERMINATION MIX 

SEQUENCING PRODUCTS ARE APPLIED TO ALF DNA SEQUENCER 

FIG. 19 






